Getting Started with Axon

Multicolored bird on a twig

Machine Learning Advisor

Sean Moriarity

You can follow along by importing the Livebook from here: https://gist.github.com/seanmor5/dc077ea5dcc44f6e9d4fbfb34d834552

Preface

NOTE: Before reading, I highly recommend checking out my first post which serves as an introduction to Nx and the Nx ecosystem.

This XKCD was published in September of 2014, roughly two years following deep learning’s watershed moment – AlexNet. At the time, deep learning was still in its nascent stages, and classifying images of “bird or no bird” might have seemed like an impossible task. Today, thanks to neural networks, we can “solve” this task in roughly 30 minutes – which is precisely what we’ll do with Elixir and Axon.

NOTE: To say the problem of computer vision is “solved” is debatable. While we’re able to achieve incredible performance on image classification, image segmentation, object detection, etc., there are still many open problems in the field. Models still fail in hilarious ways. In this context, “solved” really means suitable accuracy for the purposes of this demonstration.

Introduction

Axon is a library for creating neural networks for the Elixir programming language. The library is built entirely on top of Nx, which means it can be combined with compilers such as EXLA to accelerate programs with “just-in-time” (JIT) compilation to the CPU, GPU, or TPU. What does that actually mean? Somebody has taken care of the hard work for us! In order to take advantage of our hardware, we need optimized and specialized kernels. Fortunately, Nx and EXLA will take care of generating these kernels for us (by delegating them to another compiler). We can focus on our high-level implementation, and not the low-level details.

You don’t need to understand the intricacies of GPU programming or optimized mathematical routines to train real and practical neural networks.

What is a neural network?

A neural network is really just a function which maps inputs to outputs:

  • Pictures of cats and dogs -> Label cat or dog
  • Lot size, square footage, # of bedrooms, # of bathrooms -> Housing Price
  • Movie Review -> positive or negative rating

The “magic” is what happens during the transformation of input data to output label. Imagine a cohesive team of engineers solving problems. Each engineer brings their own unique perspective to a problem, applies their expertise, and their efforts are coordinated with the group in a meaningful way to deliver an excellent product. This coordinated effort is analogous to the coordinated effort of layers in a neural network. Each layer learns its own representation of the input data, which is then given to the next layer, and the next layer, and so on until we’re left with a meaningful representation:

In the diagram above, you’ll notice that information flows forward. Occasionally, you’ll hear the term feed-forward networks which is derived from the fact that information flows forward in a neural network.

Successive transformations in a neural network are typically referred to as layers. Mathematically, a layer is just a function:

$f(x; \theta) = f^{(1)}(f^{(2)}(x; \theta); \theta)$

where $f^{(1)}$ and $f^{(2)}$ are layers. For those who like to read code more than equations, the transformations essentially boil down to the following Elixir code:

def f(x, parameters) do
  x
  |> f_2(parameters)
  |> f_1(parameters)
end
def f_1(x, parameters) do
  {w1, b1, _, _} = parameters
  x * w1 + b1
end
def f_2(x, parameters) do
  {_, _, w2, b2} = parameters
  x * w2 + b2
end


In the diagram, you’ll also notice the term activation function. Activation functions are nonlinear element-wise functions which scale layer outputs. You can think of them as “activating” or highlighting important information as it propagates through the network. With activation functions, our simple two layer neural network starts to look something like:

def f(x, parameters) do
  x
  |> f_2(parameters)
  |> activation_2()
  |> f_1(parameters)
  |> activation_1()
end
def f_1(x, parameters) do
  {w1, b1, _, _} = parameters
  x * w1 + b1
end
def activation_1(x) do
  sigmoid(x)
end
def f_2(x, parameters) do
  {_, _, w2, b2} = parameters
  x * w2 + b2
end
def activation_2(x) do
  sigmoid(x)
end


The “learning” in “deep learning” comes from learning the parameters defined in the above functions such that they effectively solve a given task. As noted at the beginning of this post, “solve” is a relative term. Neural networks trained on separate tasks will have entirely different success criteria.

The learning process is commonly referred to as training. Neural networks are typically trained using gradient descent. Gradient descent optimizes the parameters of a neural network to minimize a loss function. A loss function is essentially the success criteria you define for your problem.
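Conceptually, a single gradient descent update looks something like the following sketch. The loss here is a made-up mean squared error over a single `{w, b}` parameter pair, and the learning rate is an arbitrary constant; in practice, Axon's optimizers handle all of this bookkeeping for us:

```elixir
import Nx.Defn

# A made-up mean squared error loss over a single {w, b} parameter pair.
defn loss({w, b}, x, y) do
  y_pred = x * w + b
  Nx.mean((y_pred - y) * (y_pred - y))
end

# One gradient descent step: move each parameter a small step in the
# direction that decreases the loss.
defn step({w, b} = params, x, y, learning_rate) do
  {grad_w, grad_b} = grad(params, fn p -> loss(p, x, y) end)
  {w - learning_rate * grad_w, b - learning_rate * grad_b}
end
```

Repeating this step over many batches of data is, at its core, all that “training” means.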

Given a task, we can essentially boil down the process of creating and training neural networks to:

  1. Gather, explore, normalize the data
  2. Define the model
  3. Define success criteria (loss function)
  4. Define the training process (optimizer)
  5. Instrument with metrics, logging, etc.
  6. Go!

Axon makes steps 2-6 quick and easy – so much so that most of your time should be spent on step 1 with the data. For the rest of this post, we’ll walk through an example workflow in Axon, and see how easy it is to create and train a neural network from scratch.

Requirements

To start, we’ll need to install some prerequisites. For this example, we’ll use Axon, Nx, and EXLA to take care of our data processing and neural network training. We’ll use Flow for creating a simple IO input pipeline. Pixels will help us decode our raw JPEGs and PNGs to tensors. Finally, Kino will allow us to render some of our data for analysis.

Additionally, we’ll set the default Defn compiler to EXLA. This will ensure all of our functions in Axon run using the XLA compiler. This really just means they’ll run much faster than they would in pure Elixir.

Mix.install([
  {:axon, "~> 0.1.0-dev", github: "elixir-nx/axon"},
  {:exla, "~> 0.1.0-dev", github: "elixir-nx/nx", sparse: "exla"},
  {:nx, "~> 0.1.0-dev", github: "elixir-nx/nx", sparse: "nx", override: true},
  {:flow, "~> 1.1.0"},
  {:pixels, "~> 0.2.0"},
  {:kino, "~> 0.3.1"}
])

Nx.Defn.default_options(compiler: EXLA)


The Data

Our goal is to differentiate between images of birds and not birds. While in a practical setting we’d probably want our negative examples to include images of nature and the other settings birds are found in, in this example we’ll use pictures of cats. Cats are definitely not birds.

For images of birds, we’ll use Caltech-UCSD Birds 2011 which is an open-source dataset consisting of around 11k images of various birds. For images of cats, we’ll use Cats vs. Dogs which is a dataset consisting of around 25k images of cats and dogs. The rest of this post will assume this data is downloaded locally.

Let’s start by getting an idea of what we’re working with:

cats = "PetImages/Cat/*.jpg"
birds = "CUB_200_2011/images/*/*.jpg"
num_cats =
  cats
  |> Path.wildcard()
  |> Enum.count()
num_birds =
  birds
  |> Path.wildcard()
  |> Enum.count()
IO.write("Number of cats: #{num_cats}, Number of birds: #{num_birds}")


Fortunately, our dataset is relatively balanced. In total we have around 25000 images. This is a little on the low side for most practical deep learning problems. Both data quantity and data quality have a large impact on the performance of your neural networks. For our example here the data will suffice, but for practical purposes you’d want to conduct a full data exploration and analysis before diving in.

We can use Kino to get an idea of what examples in our dataset look like:

cats
|> Path.wildcard()
|> Enum.random()
|> File.read!()
|> Kino.Image.new("image/jpeg")

birds
|> Path.wildcard()
|> Enum.random()
|> File.read!()
|> Kino.Image.new("image/jpeg")


One thing you might notice is that our images are not normalized in terms of height and width. Axon requires all images to have the same height, width, and number of color channels. In order to train and run our neural network, we’ll need to process each image into the same dimensions.

Additionally, our images are encoded as PNGs and JPEGs. Axon only works with tensors, so we’ll need to read each image into a tensor before we can use it. We can do this using Pixels and a sprinkle of Nx. First, let’s see how we can go from image to tensor:

{:ok, image} =
  cats
  |> Path.wildcard()
  |> Enum.random()
  |> Pixels.read_file()
%{data: data, height: height, width: width} = image
data
|> Nx.from_binary({:u, 8})
|> Nx.reshape({4, height, width}, names: [:channels, :height, :width])


Nx encodes images as values at each pixel. By default, Pixels decodes images in RGBA format. So, for each pixel in an image with shape {height, width}, we have 4 8-bit integer values: red, green, blue, and alpha (opacity). Pixels conveniently gives us a binary of pixel data, and the height and width of the image. So we can create a tensor using Nx.from_binary/2 and then reshape to the correct input shape using Nx.reshape/2.

When working with images, it’s common to normalize pixel values to fall between 0 and 1. This helps stabilize the training of neural networks (most parameters are initialized to a value between 0 and 0.06). To do this, we can simply divide our image by 255:

data
|> Nx.from_binary({:u, 8})
|> Nx.reshape({4, height, width}, names: [:channels, :height, :width])
|> Nx.divide(255.0)


Now, let’s take some of this exploration and turn it into a legitimate input pipeline.

Input Pipeline

Now that we know how to get a tensor from an image, we can go about constructing the input pipeline. In this example, our pipeline will just be an Elixir Stream. In most machine learning applications, datasets will be too large to load entirely into memory. Instead, we want to construct an efficient pipeline of data preprocessing and normalization which runs in parallel with model training. For example, we can train our models entirely on the GPU, and process new data at the same time on the CPU.

Right now, we can retrieve our data as paths to separate image directories. We’ll start by labeling images in respective directories, shuffling the input data, and then splitting it into train, validation, and test sets:

cats_path_and_label =
  cats
  |> Path.wildcard()
  |> Enum.map(&{&1, 0})
birds_path_and_label =
  birds
  |> Path.wildcard()
  |> Enum.map(&{&1, 1})

image_path_and_label = cats_path_and_label ++ birds_path_and_label
num_examples = Enum.count(image_path_and_label)
num_train = floor(0.8 * num_examples)
num_val = floor(0.2 * num_train)
{train, test} =
  image_path_and_label
  |> Enum.shuffle()
  |> Enum.split(num_train)
{val, train} =
  train
  |> Enum.split(num_val)


This sort of dataset division is common when training neural networks. Each separate dataset serves a separate purpose:

  • Train set - consists of examples that the network explicitly trains on. This should be the largest portion of your dataset, typically 70-90% depending on dataset size.
  • Validation set - consists of examples used to evaluate the model during training. Because the model is never explicitly trained on them, they provide a means of monitoring for overfitting. This is typically a small percentage of the train set.
  • Test set - consists of examples unseen during training and validation, which are used to validate the trained model’s performance.

As you’ll see, Axon makes it easy to create training and evaluation pipelines which make use of all of these datasets.

Next, we’ll create a function which returns a stream given a list of image paths and labels. Our stream should:

  1. Parse each given image path into a tensor, filtering out bad images
  2. Pad or crop each image to a fixed size
  3. Rescale the image pixel values to between 0 and 1
  4. Batch input images

A batch is just a collection of training examples. In theory, we’d want to update a neural network’s parameters using the gradient of each parameter with respect to the model’s loss over the entire training dataset. In reality, most datasets are far too large for this. Instead, we update models incrementally on batches of training data. One full pass of batches through the entire dataset is called an epoch.

In this example, we’ll batch images into batches of 32. The choice of batch size is arbitrary; however, it’s common to use batch sizes which are multiples of 32, e.g. 32, 64, 128, etc.

max_height = 32
max_width = 32
batch_size = 32
resize_dimension = fn tensor, dim, limit ->
  axis_size = Nx.axis_size(tensor, dim)
  cond do
    axis_size == limit ->
      tensor
    axis_size < limit ->
      pad_val = 0
      pad_top = :rand.uniform(limit - axis_size)
      pad_bottom = limit - (axis_size + pad_top)
      rank = Nx.rank(tensor) - 1
      # Pad only the target axis; all other axes are left untouched
      axis = Enum.find_index(Nx.names(tensor), &(&1 == dim))
      pads = for i <- 0..rank, do: if(i == axis, do: {pad_top, pad_bottom, 0}, else: {0, 0, 0})
      Nx.pad(tensor, pad_val, pads)
    true ->
      slice_start = :rand.uniform(axis_size - limit)
      slice_length = limit
      Nx.slice_axis(tensor, slice_start, slice_length, dim)
  end
end
resize_and_rescale = fn image ->
  image
  |> resize_dimension.(:height, max_height)
  |> resize_dimension.(:width, max_width)
  |> Nx.divide(255.0)
end
pipeline = fn paths ->
  paths
  |> Flow.from_enumerable()
  |> Flow.flat_map(fn {path, label} ->
    case Pixels.read_file(path) do
      {:error, _} ->
        [:error]
      {:ok, image} ->
        %{data: data, height: height, width: width} = image
        tensor =
          data
          |> Nx.from_binary({:u, 8})
          |> Nx.reshape({4, height, width}, names: [:channels, :height, :width])
        [{tensor, label}]
    end
  end)
  |> Stream.reject(fn
    :error -> true
    _ -> false
  end)
  |> Stream.map(fn {img, label} ->
    {Nx.Defn.jit(resize_and_rescale, [img]), label}
  end)
  |> Stream.chunk_every(batch_size, batch_size, :discard)
  |> Stream.map(fn imgs_and_labels ->
    {imgs, labels} = Enum.unzip(imgs_and_labels)
    {Nx.stack(imgs), Nx.new_axis(Nx.stack(labels), -1)}
  end)
end


Let’s break down this input pipeline a little more. First, we create a function which resizes input images to have a max height and max width of 32. You can make your images larger, but this will consume more memory and make the training process a little bit slower. You might see a slight boost in final accuracy as the image retains more of the original image’s information. Our random crop or pad function is actually pretty bad in terms of performance. This is because libraries such as XLA do really poorly with dynamic input shapes. A better solution would be to make use of dedicated image manipulation routines such as those in OpenCV. For this example, our solution will suffice.

Next, we define our pipeline using Flow. Flow will apply our image reading and decoding routine concurrently to our list of input paths. Pixels takes care of the work of actually decoding our images into binary data. Unfortunately, some of the images in our dataset are corrupted. Thus, we need to mark these with :error and throw them out before we attempt to train with them.

Next we apply our resizing method with Nx.Defn.jit. Nx.Defn.jit uses the default compiler options to explicitly JIT compile a function. Typically, we’d define the functions we want to accelerate within a module as defn; however, we can also define anonymous functions and explicitly JIT compile them this way. After we have our images as tensors, we group adjacent examples into groups of 32 and “stack” them on top of each other. Our final stream will return tensors of shapes {32, 4, 32, 32} and {32, 1} in a lazy manner. This will ensure we don’t load every image into memory at once, but instead load them as we need them. We can use our pipeline function to create pipelines from the splits we defined previously:
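For comparison, the module-based version of that rescaling step might look like the sketch below (the module and function names are made up for illustration). defn functions are compiled with the default Defn compiler, which we set to EXLA earlier:

```elixir
defmodule Preprocess do
  import Nx.Defn

  # Rescale pixel values from [0, 255] to [0, 1], as in `resize_and_rescale`.
  # As a defn, this is JIT compiled under the default compiler automatically.
  defn rescale(image) do
    image / 255.0
  end
end
```

Both approaches produce the same accelerated computation; the anonymous-function form is just convenient for quick pipelines like ours.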

train_data = pipeline.(train)
val_data = pipeline.(val)
test_data = pipeline.(test)


With our pipelines created, it’s time to create our model!

The Model

Before we can train a model, we need a model to train! Axon makes the process of creating neural networks easy with its model creation API. Axon defines the layers of a neural network as composable functions. Each function returns an Axon struct which retains information about the model for use during initialization and prediction. The model we’ll define here is known as a convolutional neural network. It’s a special kind of neural network used mostly in computer vision tasks.

All Axon models start with an explicit input definition. This is necessary because successive layer parameters depend specifically on the input shape. You are allowed to define one dimension as nil, representing a variable batch size. Our images are in batches of 32 with 4 color channels in a 32x32 image. Thus, our input shape is {nil, 4, 32, 32}. Following the input definition, you define each successive layer. You can essentially read the model from the top down as a series of transformations.

model =
  Axon.input({nil, 4, 32, 32})
  |> Axon.conv(32, kernel_size: {3, 3})
  |> Axon.batch_norm()
  |> Axon.relu()
  |> Axon.max_pool(kernel_size: {2, 2})
  |> Axon.conv(64, strides: [2, 2])
  |> Axon.batch_norm()
  |> Axon.relu()
  |> Axon.max_pool(kernel_size: {2, 2})
  |> Axon.conv(32, kernel_size: {3, 3})
  |> Axon.batch_norm()
  |> Axon.relu()
  |> Axon.global_avg_pool()
  |> Axon.dense(1, activation: :sigmoid)


Notice how Axon gives us a nice table which shows how each layer transforms the model input, as well as the number of parameters in each layer and in the model as a whole. This is a high-level summary of the model, and can be useful for debugging intermediate shape issues and for determining the size of a given model.

For this post, we’ll gloss over the details of what each layer does and how it helps the neural network learn good representations of the input data. However, one thing that is important to note is the final sigmoid layer. Our problem is a binary classification problem. That means we want to classify images in one of two classes: bird or not bird. Because of this, we want our neural network to predict a probability between 0 and 1. Probabilities closer to 1 indicate a higher confidence that an example is a bird. Probabilities closer to 0 represent a lower confidence in an example being a bird. sigmoid is a function which always returns a value between 0 and 1. Thus, it will return the probability we’re looking for.
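Concretely, sigmoid(x) = 1 / (1 + e^(-x)), which squashes any real number into (0, 1). We can convince ourselves of its squashing behavior directly in Nx (a quick sketch; the exact printed precision depends on your Nx version):

```elixir
# Large negative logits map toward 0, large positive logits toward 1,
# and 0 maps to exactly 0.5.
Nx.sigmoid(Nx.tensor([-10.0, 0.0, 10.0]))
```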

Training Day

Now that we’ve defined the network, it’s time to define the training process! Axon abstracts the training and evaluation process into a unified Loop API. Training and evaluation are really just loops which carry state over some dataset. Axon takes away as much of the boilerplate of writing these loops away as possible.

In order to define a training loop, we start from the Axon.Loop.trainer/4 factory method. This creates a Loop struct with some pre-populated fields specific to model training. Axon.Loop.trainer/4 takes four parameters:

  1. The model - this is the model we want to train
  2. The loss - this is our training objective
  3. The optimizer - this is how we will train
  4. Options - miscellaneous options

We’ve already defined our model. In this example, we’ll use the binary_cross_entropy loss function. This is the loss function you’ll want to use with most binary classification tasks. Our optimizer is the adam optimizer. Adam is a variant of gradient descent which works pretty well for most tasks. Finally, we specify the log option to tell our trainer to log training output on every iteration.

After creating a loop, it’s necessary to instrument it with metrics and handlers. metrics are anything you want to track during training. For example, we want to keep track of our model’s accuracy during training. Accuracy is a bit more readily interpretable than loss, so this will help us ensure that our model is actually training. handlers take place on specific events. For example, logging is actually implemented as a handler which runs after each batch. In this example, we’ll call the validate handler which will run a validation loop at the end of each epoch. Our validation loop will let us know if our model is overfitting on the training data.

Finally, after creating and instrumenting our loop, we need to run it. Axon.Loop.run/3 takes the actual loop we want to run, the input data we want to loop over, and some loop-specific options. In this example, we’ll have our loop run for a total of 5 epochs. That means we will run our loop a total of 5 full times through the training data (note this will take upwards of 20 minutes to complete depending on the capabilities of your machine):

model_state =
  model
  |> Axon.Loop.trainer(:binary_cross_entropy, :adam, log: 1)
  |> Axon.Loop.metric(:accuracy)
  |> Axon.Loop.validate(model, val_data)
  |> Axon.Loop.run(train_data, epochs: 5)


Notice how our model incrementally improves epoch over epoch. The output of our training loop is the trained model state. We can use this to evaluate our model on our test set. In a practical setting, you’d want to save this state for use in production.
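One minimal way to persist the trained state is plain Erlang term serialization; this is just a sketch (the filename is arbitrary, and newer Nx versions ship dedicated serialization helpers):

```elixir
# Save the trained parameters to disk...
File.write!("model_state.bin", :erlang.term_to_binary(model_state))

# ...and later restore them for inference.
model_state = "model_state.bin" |> File.read!() |> :erlang.binary_to_term()
```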

Did we need a research team and 5 years?

The process of writing an evaluation loop is very similar to the process of writing a training loop in Axon. We start from a factory, in this case Axon.Loop.evaluator/2. This function takes the model we want to evaluate, and the trained model state.

Next, we instrument again with metrics and handlers.

Finally, we run – this time on our test set:

model
|> Axon.Loop.evaluator(model_state)
|> Axon.Loop.metric(:accuracy)
|> Axon.Loop.run(test_data)


Our model finished with 78% accuracy, which means we were able to differentiate between birds and not birds about 78% of the time. Considering this took us under an hour to do, I would say that’s pretty incredible progress!

Conclusion

While this post glossed over many of the very specific details of how neural networks work, I hope it demonstrated the power of neural networks to perform well on what we once perceived to be very challenging machine learning problems. Additionally, I hope this post inspired you to take a deeper look into the field of deep learning and specifically into Axon.

In future posts, we’ll take a much closer look at the details and math underpinning neural networks and training neural networks, at how Axon makes use of Nx under the hood, and at some more specific problems that you can use Axon to solve. If you’re interested in learning more about Axon or Nx, be sure to check out the Elixir Nx Organization, come chat with us in the EEF ML Working Group Slack, or ask questions on the Nx Elixir Forum.

Until next time!

Up and Running Nx

Photograph titled "Dancing Roof" of a wavy grid-like roof pattern

Machine Learning Advisor

Sean Moriarity

This is the first blog post in a series by guest writer Sean Moriarity, co-author of the Elixir Nx library and author of the book “Genetic Algorithms in Elixir”.

Nx is a new library for tensor manipulation and numerical computing on the BEAM. Nx hopes to open doors for Elixir, Erlang, and other BEAM languages to new, exciting domains by allowing users to accelerate code through JIT compilation and providing interfaces to highly-specialized tensor manipulation routines. In this post, you will learn some of the basics needed to get started with Nx, and you’ll see a basic example of how Nx can be used for machine learning applications.

Getting Comfortable with Tensors

The Nx definition of “tensor” is similar to the PyTorch or TensorFlow tensor, or the NumPy multidimensional array. If you’re coming from one of those frameworks, manipulating Nx tensors should feel familiar to you. One thing to note - the Nx definition of a tensor is not necessarily the same as the pure math definition of a tensor. Nx follows most of the conventions and precedents put forth by the Python ecosystem, so transitioning from any of those frameworks should be relatively easy. For Elixir programmers, it’s easy to think of tensors as nested lists, with some additional metadata:

iex> Nx.tensor([[1, 2, 3], [4, 5, 6]])
#Nx.Tensor<
  s64[2][3]
  [
    [1, 2, 3],
    [4, 5, 6]
  ]
>


Nx.tensor/2 is one method you can use to create a tensor. It works with both nested lists of numbers and scalars:

iex> Nx.tensor(1.0)
#Nx.Tensor<
  f32
  1.0
>


Notice the additional metadata that comes out when tensors are inspected, namely s64[2][3] and f32 in the examples above. Tensors have both shapes and types associated with them. A tensor’s shape is a tuple representing the size of each dimension in the tensor. In the examples above, the first tensor has a shape of {2, 3} as represented by [2][3] in the inspected tensor:

iex> Nx.shape(Nx.tensor([[1, 2, 3], [4, 5, 6]]))
{2, 3}


If you’re comfortable thinking of tensors as nested lists, this should make some intuitive sense - the first example contains 2 lists of 3 elements each. If you were to wrap the first example in more lists, the shape would change accordingly:

iex> Nx.shape(Nx.tensor([[[[1, 2, 3], [4, 5, 6]]]]))
{1, 1, 2, 3}


1 list of 1 list of 2 lists of 3 elements

This line of thinking can be a bit confusing when working with scalars. The shape of a scalar tensor is represented by an empty tuple:

iex> Nx.shape(Nx.tensor(1.0))
{}


This is because scalars are actually 0-dimensional tensors. They don’t have any dimensions and therefore, they have an “empty” shape.
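If you want to check dimensionality directly, Nx also exposes Nx.rank/1, which returns the number of dimensions - 0 for scalars. A small sketch, with outputs based on the examples above:

```elixir
# A scalar tensor has rank 0; the earlier nested example has rank 2.
Nx.rank(Nx.tensor(1.0))
# => 0

Nx.rank(Nx.tensor([[1, 2, 3], [4, 5, 6]]))
# => 2
```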

A tensor’s type is the numeric type associated with the tensor. Types in Nx are represented as two-element tuples with a type class and a size or bit width:

iex> Nx.type(Nx.tensor([[1, 2, 3], [4, 5, 6]]))
{:s, 64}
iex> Nx.type(Nx.tensor(1.0))
{:f, 32}


Types are important because they tell Nx how to store tensors internally. Nx tensors are internally represented as binaries:

iex> Nx.to_binary(Nx.tensor(1))
<<1, 0, 0, 0, 0, 0, 0, 0>>
iex> Nx.to_binary(Nx.tensor(1.0))
<<0, 0, 128, 63>>


A Note on Endianness: Nx uses the native endianness specification, so the endianness of the binary is resolved at load-time to match the endianness of your machine. If, for some reason, your project requires big or little endian regardless of the machine it’s on, please open an issue describing your use case.
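As a concrete (machine-dependent) sketch of the note above: on a little-endian machine, the least significant byte of each value is stored first, so the same integer would serialize with the opposite byte order on big-endian hardware:

```elixir
# 256 = 0x0100; on little-endian hardware the low byte comes first.
Nx.to_binary(Nx.tensor(256, type: {:s, 16}))
# => <<0, 1>> on a little-endian machine
```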

Notice the internal binary representation changes with a floating-point versus a signed integer type. You should also notice that Nx will attempt to infer the input type; however, you can also specify the input type when creating tensors using the type option:

iex> Nx.to_binary(Nx.tensor(1, type: {:f, 32}))
<<0, 0, 128, 63>>
iex> Nx.to_binary(Nx.tensor(1.0))
<<0, 0, 128, 63>>


Because Nx tensors are represented as binaries, you should almost never use Nx.tensor/2 in practice, because it’s expensive for very large tensors. Nx exposes a useful function, Nx.from_binary/2, which does not require traversing a nested list:

iex> Nx.from_binary(<<0, 0, 128, 63, 0, 0, 0, 64, 0, 0, 64, 64>>, {:f, 32})
#Nx.Tensor<
  f32[3]
  [1.0, 2.0, 3.0]
>


Nx.from_binary/2 takes a binary and a type and always returns a 1-dimensional tensor. If you want your data to have a different shape, you can use Nx.reshape/2:

iex> Nx.reshape(Nx.from_binary(<<0, 0, 128, 63, 0, 0, 0, 64, 0, 0, 64, 64>>, {:f, 32}), {3, 1})
#Nx.Tensor<
  f32[3][1]
  [
    [1.0],
    [2.0],
    [3.0]
  ]
>


Nx.reshape/2 only ever changes the shape attribute of the tensor, so it’s a relatively inexpensive operation. When your data comes in as a binary, using Nx.from_binary/2 with Nx.reshape/2 is the most efficient way to create tensors.

Working with Tensor Operations

If you’re an experienced Elixir programmer, you’re probably intimately familiar with the Enum module for manipulating collections that implement the Enumerable protocol. Because of this, you’ll probably reach for the functional constructs map and reduce. Nx does expose both map and reduce as functions for manipulating tensors, and they work almost exactly the way you’d expect; however, you should almost never use them.

All of the operations in the Nx library are tensor-aware, which means they work on tensors of any shape and type. For example, in Elixir you might be used to doing something like:

iex> Enum.map([1, 2, 3], fn x -> :math.cos(x) end)
[0.5403023058681398, -0.4161468365471424, -0.9899924966004454]


But, you can achieve the same thing in Nx using just Nx.cos/1:

iex> Nx.cos(Nx.tensor([1, 2, 3]))
#Nx.Tensor<
  f32[3] 
  [0.5403022766113281, -0.416146844625473, -0.9899924993515015]
>


All of the unary operators in Nx work this way - they apply a function element-wise to a tensor of any type and any shape:

iex> Nx.exp(Nx.tensor([[[1], [2], [3]]]))
#Nx.Tensor<
  f32[1][3][1]
  [
    [
      [2.7182817459106445],
      [7.389056205749512],
      [20.08553695678711]
    ]
  ]
>
iex> Nx.sin(Nx.tensor([[1, 2, 3]]))
#Nx.Tensor<
  f32[1][3]
  [
    [0.8414709568023682, 0.9092974066734314, 0.14112000167369843]
  ]
>
iex> Nx.acosh(Nx.tensor([1, 2, 3]))
#Nx.Tensor<
  f32[3] 
  [0.0, 1.316957950592041, 1.7627471685409546]
>


There’s almost never a need to use something like Nx.map, because the element-wise unary operators can almost always achieve the same effect. Nx.map will almost always be less efficient, and you will be unable to use Nx transforms like grad with Nx.map. Additionally, Nx.map cannot be supported by some Nx backends or compilers, so portability is a concern. The same applies to aggregation: you should prefer the Nx-provided aggregate functions like Nx.sum, Nx.mean, and Nx.product over implementing your own with Nx.reduce:
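To make the comparison concrete, here’s a sketch of a hand-rolled sum via Nx.reduce/3 next to the idiomatic Nx.sum/1 - both produce the same result, but the dedicated aggregate is faster and more portable:

```elixir
t = Nx.tensor([1, 2, 3])

# Hand-rolled reduction - works, but less efficient and less portable.
Nx.reduce(t, 0, fn x, acc -> Nx.add(x, acc) end)
# => s64 scalar tensor containing 6

# Idiomatic equivalent using the built-in aggregate.
Nx.sum(t)
# => s64 scalar tensor containing 6
```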

iex> Nx.sum(Nx.tensor([1, 2, 3]))
#Nx.Tensor<
  s64
  6
>
iex> Nx.product(Nx.tensor([1, 2, 3]))
#Nx.Tensor<
  s64
  6
>
iex> Nx.mean(Nx.tensor([1, 2, 3]))
#Nx.Tensor<
  f32
  2.0
>


Nx aggregate functions also have the added benefit of being able to reduce along one or more axes. For example, if you have a batch of examples, you might want to compute the mean of each example individually:

iex> Nx.mean(Nx.tensor([[1, 2, 3], [4, 5, 6]]), axes: [1])
#Nx.Tensor<
  f32[2] 
  [2.0, 5.0]
>


You can even provide multiple axes:

iex> Nx.mean(Nx.tensor([[[1, 2, 3], [4, 5, 6]]]), axes: [0, 1])
#Nx.Tensor<
  f32[3] 
  [2.5, 3.5, 4.5]
>


Nx also has binary operators that are tensor aware. Things like addition, subtraction, multiplication, and division work element-wise:

iex> Nx.add(Nx.tensor([1, 2, 3]), Nx.tensor([4, 5, 6]))
#Nx.Tensor<
  s64[3]
  [5, 7, 9]
>
iex> Nx.subtract(Nx.tensor([[1, 2, 3]]), Nx.tensor([[4, 5, 6]]))
#Nx.Tensor<
  s64[1][3]
  [-3, -3, -3]
>
iex> Nx.multiply(Nx.tensor([[1], [2], [3]]), Nx.tensor([[4], [5], [6]]))
#Nx.Tensor<
  s64[3][1]
  [
    [4],
    [10],
    [18]
  ]
>
iex> Nx.divide(Nx.tensor([1, 2, 3]), Nx.tensor([4, 5, 6]))
#Nx.Tensor<
  f32[3] 
  [0.25, 0.4000000059604645, 0.5]
>


With binary operators, however, there is an additional caveat: the tensor shapes must be compatible or capable of being broadcasted to the same shape. Broadcasting occurs when the input tensors have different shapes:

iex> Nx.add(Nx.tensor(1), Nx.tensor([1, 2, 3]))
#Nx.Tensor<
  s64[3]
  [2, 3, 4]
>


In the previous example, the scalar tensor 1 is broadcasted over the larger tensor. Broadcasting can be used to implement more memory-efficient routines by relaxing the need to work with tensors of the same shape. For example, if you need to multiply a 50x50x50 tensor by 2, you can use broadcasting to turn the operation into a loop which iterates over the 50x50x50 tensor, multiplying each element by 2, rather than creating another 50x50x50 tensor of all 2s.

In order for two tensors to be capable of broadcasting, each of their dimensions must be compatible. Dimensions are compatible if one of the following requirements is met:

  1. They are equal
  2. One of the dimensions is size 1
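For example, shapes {2, 3} and {1, 3} are compatible: the trailing dimensions are equal, and the leading dimensions are compatible because one of them is size 1. The size-1 dimension is stretched to match:

```elixir
# The single row of the second tensor is broadcast across both rows.
Nx.add(Nx.tensor([[1, 2, 3], [4, 5, 6]]), Nx.tensor([[10, 20, 30]]))
# => [[11, 22, 33], [14, 25, 36]], shape {2, 3}
```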

When you attempt to broadcast incompatible tensors, you’ll be met with the following runtime error:

iex> Nx.add(Nx.tensor([[1, 2, 3], [4, 5, 6]]), Nx.tensor([[1, 2], [3, 4]]))
** (ArgumentError) cannot broadcast tensor of dimensions {2, 3} to {2, 2}
    (nx 0.1.0-dev) lib/nx/shape.ex:241: Nx.Shape.binary_broadcast/4
    (nx 0.1.0-dev) lib/nx.ex:2430: Nx.element_wise_bin_op/4


If necessary, you can get around broadcasting issues by expanding, padding, or slicing one of the input tensors; however, you should carefully consider how this might affect the outcome of your algorithm.
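As one sketch of the “expanding” approach, you can reshape an operand so the shapes become broadcast-compatible - here, reshaping two vectors into a column and a row produces every pairwise sum:

```elixir
a = Nx.reshape(Nx.tensor([1, 2, 3]), {3, 1})  # shape {3, 1}
b = Nx.reshape(Nx.tensor([10, 20]), {1, 2})   # shape {1, 2}

# {3, 1} and {1, 2} broadcast together to shape {3, 2}.
Nx.add(a, b)
# => [[11, 21], [12, 22], [13, 23]]
```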

Basic Linear Regression

So far, we’ve spent all of our time in iex with trivial examples and demonstrations of tensor operations. All of our work could have been done with some clever use of Enum and lists. In this section, we’ll start to unlock some of the real power of Nx by solving a basic linear regression problem using gradient descent.

You’ll want to start by creating a new Mix project that imports Nx, as well as an Nx compiler or backend. In this example, I’ll be using EXLA; however, you can use Torchx with some minor adjustments to this example. There are some fundamental differences between EXLA and Torchx that are outside the scope of this post; however, both of them will work fine for this example.

At the time of this writing, Nx is still not available on Hex, so you’ll need to use a Git dependency in your mix.exs:

def deps do
  [
    {:exla, "~> 0.1.0-dev", github: "elixir-nx/nx", sparse: "exla"},
    {:nx, "~> 0.1.0-dev", github: "elixir-nx/nx", sparse: "nx", override: true}
  ]
end


Now you can run:

$ mix deps.get && mix deps.compile


If this is your first time compiling EXLA, it will take quite a bit of time on the first compilation. You’ll also want to take a look at the installation section of the EXLA README for prerequisites and troubleshooting steps.

Once both Nx and EXLA are compiled, create a new file, regression.exs somewhere inside your Mix project. Inside regression.exs, create a module and import Nx.Defn:
NxEXLA 都编译完成后,在 Mix 项目中的某处创建一个新文件 regression.exs 。在 regression.exs 中,创建一个模块并导入 Nx.Defn

defmodule LinReg do
  import Nx.Defn
end


Nx.Defn is the module that contains the definition of defn, a macro for declaring numerical definitions. Numerical definitions work just like regular Elixir functions; however, they support only a limited subset of the Elixir language in exchange for JIT compilation to accelerators such as GPUs. defn also replaces much of the Elixir kernel with Nx-specific implementations. As an example:

defn add_two(a, b) do
  a + b
end


will work on both tensors and scalars, because + internally resolves to Nx.add/2. defn also supports a special transformation: grad. grad is a macro that returns the gradient of a function with respect to some provided parameters. The gradient of a function provides information about its rate of change with respect to those parameters. A complete discussion of gradients falls outside the scope of this post - for now, you just need to know how to use grad and what it means at a high level.
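As a minimal sketch of grad (the module and function names here are made up for illustration): the derivative of x * x is 2 * x, which grad recovers automatically:

```elixir
defmodule GradExample do
  import Nx.Defn

  defn square(x), do: x * x

  # grad/2 evaluates the gradient of the given function at x.
  defn square_grad(x), do: grad(x, &square/1)
end

GradExample.square_grad(Nx.tensor(3.0))
# => f32 scalar tensor containing 6.0
```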

As I mentioned before, we’ll be implementing a basic linear regression model using gradient descent. Linear regression is an approach to modeling the relationship between some number of input variables and an output variable. The input variables are called the explanatory variables because they are assumed to have a causal relationship which explains the behavior of an output variable. As a practical example, imagine you want to predict the number of daily average users to your website based on the month, time of day, and whether or not there is an ongoing promotion on the website. You can collect data over the course of several months, and then use this data to fit a basic regression model that predicts daily average users for you.

In our example, we’ll create a regression model that predicts an output variable with respect to 1 input variable. We’ll start by defining our training set outside of the LinReg module:

target_m = :rand.normal(0.0, 10.0)
target_b = :rand.normal(0.0, 5.0)
target_fn = fn x -> target_m * x + target_b end
data =
  Stream.repeatedly(fn -> for _ <- 1..32, do: :rand.uniform() * 10 end)
  |> Stream.map(fn x -> Enum.zip(x, Enum.map(x, target_fn)) end)
IO.puts("Target m: #{target_m}\tTarget b: #{target_b}\n")


First, we define target_m, target_b and target_fn. Our linear function has the form: y = m*x + b, so we create a Stream that repeatedly generates batches of input and output pairs by applying target_fn on random inputs. Our goal is to learn target_m and target_b using gradient descent.

The next thing we need to define is our model. A model is really just a parameterized function that maps inputs to outputs. We know our function should have the form y = m * x + b, so we can easily define our model in the same way:

defmodule LinReg do
  import Nx.Defn
  defn predict({m, b}, x) do
    m * x + b
  end
end


Next, we need to define a loss function. Loss functions evaluate predictions with respect to true data, often to measure the divergence between a model’s representation of the data-generating distribution and the true representation of the data-generating distribution. This essentially means that loss functions tell you how poor your model performs. The goal is to minimize your loss function by fitting a function to a target function.

With linear regression problems, it’s most common to use mean-squared error (MSE) as the loss function:

defn loss(params, x, y) do
  y_pred = predict(params, x)
  Nx.mean(Nx.power(y - y_pred, 2))
end


MSE measures the average squared difference between our targets and predictions. As our predictions get closer to our targets, MSE tends towards 0. Given our loss function, we need a way to update our model such that it minimizes loss/3. We can achieve this using gradient descent. Gradient descent calculates the gradient of a loss function with respect to the input parameters. The gradient then provides information on how to update model parameters.

It can be difficult to understand exactly what gradient descent is doing at first. Imagine you want to find the deepest point in a lake. You have a depth finder on your boat, but no other information. You could search over the entire lake; however, this would take an impossible amount of time. Instead, you can use your depth finder to iteratively find deeper and deeper points in smaller areas of the lake. For example, if you know traveling left increases depth from 5 to 7 meters and traveling right decreases depth from 5 to 3 meters, you would choose to move your boat left. This is, in essence, what gradient descent is doing - it gives you depth-finding information you can use to navigate a parameter space.

You can implement your update step by calculating the gradient of the loss function with respect to your parameters, and using the gradient to update each parameter, like this:

defn update({m, b} = params, inp, tar) do
  {grad_m, grad_b} = grad(params, &loss(&1, inp, tar))
  {
    m - grad_m * 0.01,
    b - grad_b * 0.01
  }
end


grad takes the parameters you want to evaluate the gradient at, as well as a parameterized function - in this case the loss function. grad_m and grad_b are the gradients of m and b respectively. You use the gradients to update m by scaling the gradients by a factor of 0.01 and then subtracting this value from m. 0.01 is also called the learning rate. You want to take small steps; large jumps cause you to move too erratically within the parameter space and inhibit learning.

update returns the updated parameters m and b. At this point, however, you need some initial starting point for both m and b. Revisiting the depth-finding example, imagine you have a friend who has some intuition about the general location of the deepest point in the lake. He tells you where to start your search, and thus you have a better shot at finding the deepest point. This is essentially the same as parameter initialization. You need to have a good starting point in order to effectively learn a good parameterization of your model:

defn init_random_params do
  m = Nx.random_normal({}, 0.0, 0.1)
  b = Nx.random_normal({}, 0.0, 0.1)
  {m, b}
end


init_random_params uses Nx.random_normal/3 to initialize m and b from a normal distribution with mean 0.0 and standard deviation 0.1. Now, you need to write a training loop. A training loop takes batches of examples and applies update after each batch, halting only after some condition is met. In this example, we’ll train on 200 batches per epoch (one full training iteration) for a configurable number of epochs:

def train(epochs, data) do
  init_params = init_random_params()
  for _ <- 1..epochs, reduce: init_params do
    acc ->
      data
      |> Enum.take(200)
      |> Enum.reduce(
        acc,
        fn batch, cur_params ->
          {inp, tar} = Enum.unzip(batch)
          x = Nx.tensor(inp)
          y = Nx.tensor(tar)
          update(cur_params, x, y)
        end
      )
  end
end


In the training loop, we take 200 batches from the stream and iteratively update the model parameters after each batch. We repeat this process epochs number of times, returning the updated params after every epoch. Now, we just need to call LinReg.train/2 to return the learned m and b:

{m, b} = LinReg.train(100, data)
IO.puts("Learned m: #{Nx.to_scalar(m)}\tLearned b: #{Nx.to_scalar(b)}")


Overall, regression.exs should now look like:

defmodule LinReg do
  import Nx.Defn
  defn predict({m, b}, x) do
    m * x + b
  end
  defn loss(params, x, y) do
    y_pred = predict(params, x)
    Nx.mean(Nx.power(y - y_pred, 2))
  end
  defn update({m, b} = params, inp, tar) do
    {grad_m, grad_b} = grad(params, &loss(&1, inp, tar))
    {
      m - grad_m * 0.01,
      b - grad_b * 0.01
    }
  end
  defn init_random_params do
    m = Nx.random_normal({}, 0.0, 0.1)
    b = Nx.random_normal({}, 0.0, 0.1)
    {m, b}
  end
  def train(epochs, data) do
    init_params = init_random_params()
    for _ <- 1..epochs, reduce: init_params do
      acc ->
        data
        |> Enum.take(200)
        |> Enum.reduce(
          acc,
          fn batch, cur_params ->
            {inp, tar} = Enum.unzip(batch)
            x = Nx.tensor(inp)
            y = Nx.tensor(tar)
            update(cur_params, x, y)
          end
        )
    end
  end
end
target_m = :rand.normal(0.0, 10.0)
target_b = :rand.normal(0.0, 5.0)
target_fn = fn x -> target_m * x + target_b end
data =
  Stream.repeatedly(fn -> for _ <- 1..32, do: :rand.uniform() * 10 end)
  |> Stream.map(fn x -> Enum.zip(x, Enum.map(x, target_fn)) end)
IO.puts("Target m: #{target_m}\tTarget b: #{target_b}\n")
{m, b} = LinReg.train(100, data)
IO.puts("Learned m: #{Nx.to_scalar(m)}\tLearned b: #{Nx.to_scalar(b)}")


Now, you can run this example:

$ mix run regression.exs
Target m: -0.057762353079829236 Target b: 0.681480460783122
Learned m: -0.05776193365454674 Learned b: 0.6814777255058289


Notice how our learned m and b are almost identical to the target m and b! We’ve successfully implemented linear regression using gradient descent; however, there’s one thing we can do to take this implementation to the next level.

You should have noticed that training for 100 epochs took a noticeable amount of time. That’s because we’re not taking advantage of defn JIT compilation with EXLA. Because this is a relatively small example, we don’t really need JIT compilation; however, you will want to accelerate your models as your implementations get more complex. First, so we can really see the difference between EXLA and pure Elixir, let’s time how long model training takes:

{time, {m, b}} = :timer.tc(LinReg, :train, [100, data])
IO.puts("Learned m: #{Nx.to_scalar(m)}\tLearned b: #{Nx.to_scalar(b)}\n")
IO.puts("Training time: #{time / 1_000_000}s")


and then run again without any acceleration:

$ mix run regression.exs
Target m: -1.4185910271067492 Target b: -2.9781437461823965
Learned m: -1.4185925722122192  Learned b: -2.978132724761963
Training time: 4.460695s


Once again, we successfully learned m and b. This time, we can see that training took about 4.5 seconds. Now, in order to take advantage of JIT compilation using EXLA, add the following attribute to your module:

defmodule LinReg do
  import Nx.Defn
  @default_defn_compiler EXLA
end


This tells Nx to use the EXLA compiler to compile all of the numerical definitions in the module. Now, run the example again:

Target m: -3.1572039775886167 Target b: -1.9610560589959405
Learned m: -3.1572046279907227  Learned b: -1.961051106452942
Training time: 2.564152s


The results are exactly the same, but we were able to train in 2.6 seconds versus 4.5 seconds - roughly a 1.7x speedup! Admittedly, this is a relatively trivial example, and the speedup you’re seeing here is only a fraction of what you would see with more complex implementations. As an example, you can attempt to run a pure Elixir implementation of the MNIST example in the EXLA repository: a single epoch will take hours to complete, whereas the EXLA-compiled version will complete in anywhere from 0.5s to 4s per epoch, depending on the accelerator and machine you’re using.

Conclusion

This post covered a lot of the Nx core functionality. You learned:

  1. How to create tensors using Nx.tensor and Nx.from_binary
  2. How to use unary, binary, and aggregate operations to manipulate tensors
  3. How to implement gradient descent using defn and Nx automatic differentiation with grad
  4. How to accelerate numerical definitions using the EXLA compiler

While this post covered the basics of what’s needed to get started with Nx, there’s still much more to learn. I hope this post motivates you to continue learning about the Nx project and inspires you to seek out unique applications of Nx in practice. Nx is still in its infancy, and there are many more exciting things ahead!

DockYard is a digital product agency offering custom software, mobile, and web application development consulting. We provide exceptional professional services in strategy, user experience, design, and full stack engineering using Ember.js, React.js, Elixir, Ruby, and other technologies. With staff nationwide, we’ve got consultants in key markets across the U.S., including Portland, San Francisco, Los Angeles, Salt Lake City, Minneapolis, Dallas, Miami, Washington D.C., and Boston.

Nx for Absolute Beginners

Machine Learning Advisor

Sean Moriarity

Introduction

We just recently released Nx v0.1.0. In honor of the release, today’s post breaks down the absolute basics of Nx. If you’re interested in learning more about Nx, machine learning in Elixir, and driving the ecosystem forward, join us in the Erlang Ecosystem Foundation ML Working Group Slack.

Nx (Numerical Elixir) is a library for creating and manipulating multidimensional arrays. It is intended to serve as the core of numerical computing and data science in the Elixir ecosystem. Programming in Nx requires a bit of a different way of thinking. If you’re familiar with the Python ecosystem, Nx will remind you a lot of NumPy. While this is true, there are some key differences - mostly due to the difference in language constructs between Elixir and Python. As one example, Nx tensors are completely immutable.
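A quick sketch of what that immutability means in practice: every Nx operation returns a new tensor and leaves its inputs untouched:

```elixir
t = Nx.tensor([1, 2, 3])

# Nx.add/2 returns a brand new tensor...
Nx.add(t, 1)
# => tensor [2, 3, 4]

# ...while the original is unchanged.
t
# => tensor [1, 2, 3]
```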

At the core of Nx is the Nx.Tensor. The Nx.Tensor is analogous to the NumPy ndarray or TensorFlow/PyTorch Tensor objects. It is the main data structure the Nx library is designed to manipulate. All of the Nx functionality such as gradient computations, just-in-time compilation, pluggable backends, etc. are built on top of implementations of the Nx.Tensor behavior.

In this post, we’ll go over what exactly an Nx.Tensor is, how to create them, and how to manipulate them. This post intentionally ignores some of the more in-depth offerings within the Nx API in order to focus on the basics. If you’re interested in learning more, I suggest checking out the Nx documentation and following myself and DockYard on Twitter to stay up to date on the latest Nx content.

Installation

Nx is a regular Elixir library, so you can install it in the same way you would any other Elixir library. Since this post is designed for you to follow along in a Livebook, we’ll use Mix.install:

Mix.install([
  {:nx, "~> 0.1.0"}
])


Lists vs. Tensors

When you first create and inspect a tensor, you’re probably inclined to think of it as a list or a nested list of numbers:

Nx.tensor([[1, 2, 3], [4, 5, 6]])

#Nx.Tensor<
  s64[2][3]
  [
    [1, 2, 3],
    [4, 5, 6]
  ]
>

That line of thinking is reasonable - after all, inspecting the values yields a nested list representation of the tensor! The truth, though, is that this visual representation is just a matter of convenience. Thinking of a tensor as a nested list is misleading and might cause you to have a difficult time grasping some of the fundamental concepts in Nx.

The Nx.Tensor is a data structure with four key fields:

  • :data
  • :shape
  • :type
  • :names

Let’s look at each of these fields in-depth.

Tensors have data

In order to perform any computations at all, a tensor needs some underlying data which contains its values. The most common way to represent a tensor’s data is with a flat VM binary - essentially just an array of bytes. This is an important implementation detail: Nx mostly operates on the raw bytes which represent individual values in a tensor. Those values are stored in a flat container - Nx doesn’t operate on lists or nested lists.

Binaries are just C byte arrays, so we’re able to perform some very efficient operations on large tensors. While this gives us a nice performance boost, it also constrains us. Our tensor operations need to know what type the byte values represent in order to perform operations correctly. This means every value in a tensor must have the same type.

Finally, the choice of representing tensor data as a flat binary leads to some interesting (and annoying) scenarios to consider. At the very least, we need to be conscious of endianness - you can’t guarantee the raw byte values of a tensor will be interpreted the same way on different machines.

Nx.tensor([[1, 2, 3], [4, 5, 6]]) |> Nx.to_binary()

<<1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 4, 0,
  0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0>>
Nx.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]) |> Nx.to_binary()

<<0, 0, 128, 63, 0, 0, 0, 64, 0, 0, 64, 64, 0, 0, 128, 64, 0, 0, 160, 64, 0, 0,
  192, 64>>

Tensors have shape

The “nested list” representation you see when inspecting a tensor is actually a manifestation of its shape. A tensor’s shape is best described as the size of each dimension. While two tensors might have the same underlying data, they can have different shapes, which fundamentally change the nature of the operations performed on them.

We describe a tensor’s shape with a tuple of integers: {size_d1, size_d2, ..., size_dn}. For example, if a tensor has a shape {2, 1, 2}, it means the tensor’s first dimension has size 2, second dimension has size 1, and third dimension has size 2:

Nx.tensor([[[1, 2]], [[3, 4]]])

#Nx.Tensor<
  s64[2][1][2]
  [
    [
      [1, 2]
    ],
    [
      [3, 4]
    ]
  ]
>

We can also describe the number of dimensions in a tensor as its rank. As you start to work more in the scientific computing space, you’ll inevitably come across descriptions of shape which reference 0-D shapes as scalars:

Nx.tensor(1)

#Nx.Tensor<
  s64
  1
>

1-D shapes as vectors:

Nx.tensor([1, 2, 3])

#Nx.Tensor<
  s64[3]
  [1, 2, 3]
>

2-D shapes as matrices:

Nx.tensor([[1, 2, 3], [4, 5, 6]])

#Nx.Tensor<
  s64[2][3]
  [
    [1, 2, 3],
    [4, 5, 6]
  ]
>

and so on.

Those descriptions aren’t inaccurate, but if you have experience with advanced mathematics, the notation in Nx will probably confuse you. This is another important note - terms and notation in Nx such as rank and tensor were chosen for their ubiquity in the numerical computing space, not for mathematical correctness.

Practically speaking, a tensor’s shape tells us 2 things:

  1. How to traverse and index a tensor
  2. How to perform shape-dependent operations

Theoretically, we could write all of our operations to work on a flat binary, but that doesn’t map very well to the real-world. We reason about things with dimensionality. Let’s consider the example of an image. A common representation of images in numerical computing is {color_channels, height, width}. A 32x32 RGB image will have shape {3, 32, 32}. Now imagine if you were asked to access the green value of the pixel at height 5 and width 17. If you have no understanding of the tensor’s shape, this would be an impossible task. However, since you do know the shape, you just need to perform a few calculations and you’ll be able to very efficiently access any value in the tensor.
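To make that indexing point concrete, here’s a small sketch in plain Elixir of how a row-major offset into the flat data falls out of the shape (the numbers are the illustrative ones from the paragraph above):

```elixir
# Shape {channels, height, width} = {3, 32, 32}, laid out row-major
{_channels, height, width} = {3, 32, 32}

# Green channel (index 1), pixel at height 5, width 17
{c, h, w} = {1, 5, 17}

# Skip whole channels, then whole rows, then columns
offset = c * height * width + h * width + w
# offset == 1 * 32 * 32 + 5 * 32 + 17 == 1201
```

Knowing the shape turns value lookup into a few multiplications and additions over the flat binary.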

To access a tensor’s shape, you can use Nx.shape:

Nx.shape(Nx.tensor([[1, 2, 3], [4, 5, 6]]))

{2, 3}

To access its rank, you can use Nx.rank:

Nx.rank(Nx.tensor([[1, 2, 3], [4, 5, 6]]))

2

Tensors have names

As a consequence of working in multiple dimensions, you often want to perform operations only on certain dimensions of an input tensor. Some Nx functions give you the option to specify an axis or axes to reduce, permute, traverse, slice, etc. The norm is to access axes by their index in a tensor’s shape. For example, axis 1 in shape {2, 6, 3} is of size 6. Unfortunately, writing code that relies on integer axis values is fragile, and difficult to read. One problem you’ll often run into is the choice of channels-first or channels-last tensor representations of images. In a channels-first configuration, the shape of an image is {channels, height, width}. In a channels-last configuration, the shape of an image is {height, width, channels}. Now consider that I write code which computes a grayscale representation by taking the maximum color value along an image’s color channels. If I write my code like:

defn grayscale(img), do: Nx.reduce_max(img, axes: [0])


It breaks if somebody attempts to use a channels-last representation! Nx remedies this with named tensors. Named tensors give semantic meaning to the axes of a tensor. We can more accurately describe an image’s shape with the keyword list [channels: 3, height: 32, width: 32]. This affords you the ability to write code like this:

defn grayscale(img), do: Nx.reduce_max(img, axes: [:channels])


To learn more about what named tensors offer, I suggest you read Tensor considered harmful which describes the initial idea.

To access the list of axes in a tensor, you can use Nx.axes:

Nx.axes(Nx.tensor([[1, 2, 3], [4, 5, 6]]))

[0, 1]

To access the list of names in a tensor, you can use Nx.names:

Nx.names(Nx.tensor([[1, 2, 3], [4, 5, 6]], names: [:x, :y]))

[:x, :y]

Tensors have a type

As mentioned before, a consequence of operating on binaries is the need to have tensors with homogenous types. In other words, every value in the tensor must be the same type. This is important for efficiency, which is why tensors exist - to support efficient, parallel computation. If we know that every value in a 1-D tensor is 16 bits long in memory and that the tensor is 128 bits long, we can quickly calculate that there are 8 values in it—128 / 16 = 8. We can also easily grab individual values for parallel calculation because we know that there’s a new value every 16 bits. Imagine if this weren’t the case; that is, if the first value were 8 bits long, the second value 32 bits, and so on. To count the items or divide them into groups, we’d have to walk through the entire tensor every time (a waste of time), and each value would have to declare its length (a waste of space). All tensors are instantiated with a datatype which describes their type and size. The type is represented as a tuple of {:type, size}.
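The arithmetic in that example is worth spelling out in plain Elixir (the numbers are illustrative only):

```elixir
bits_per_value = 16
total_bits = 128

# Homogeneous types make counting elements a single division
num_values = div(total_bits, bits_per_value)
# num_values == 8

# ...and finding the i-th value is simple pointer arithmetic
byte_offset_of = fn i -> i * div(bits_per_value, 8) end
# byte_offset_of.(3) == 6
```

With heterogeneous value sizes, neither computation would be possible without walking the whole binary.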

Valid types are:

  • :f - floating point types
  • :s - signed integer types
  • :u - unsigned integer types
  • :bf - brain floating point types

Valid sizes are:

  • 8, 16, 32, 64 for signed and unsigned integer types
    8、16、32、64 用于有符号和无符号整数类型
  • 16, 32, 64 for floating point types
    16、32、64 用于浮点类型
  • 16 for brain floating point types
    16 用于大脑浮点类型

The size of the type more accurately describes its precision. While 64-bit types consume more memory and are slower to operate on, they are more precise than their 32-bit counterparts. The default integer type in Nx is {:s, 64}. The default float type is {:f, 32}. When creating tensors with mixed values, Nx will promote the values to the “highest” type, preferring (for example) to waste some space by representing a 16-bit float in 32 bits rather than to lose information by chopping a 32-bit float down to 16 bits. This is called type promotion. Type promotion is outside the scope of this post, but it’s something to be aware of.
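As a quick taste of type promotion - assuming the default types above - combining an integer tensor with a float tensor should yield a float result:

```elixir
a = Nx.tensor([1, 2, 3])        # defaults to {:s, 64}
b = Nx.tensor([1.0, 2.0, 3.0])  # defaults to {:f, 32}

# Mixed-type operations promote to the "higher" type
Nx.type(Nx.add(a, b))
#=> {:f, 32}
```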

You can get the type of a tensor with Nx.type:

Nx.type(Nx.tensor([[1, 2, 3], [4, 5, 6]]))

{:s, 64}
Nx.type(Nx.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))

{:f, 32}

As you can see, tensors are not lists. Tensors are a data structure designed to do one thing well: crunch numbers. Lists are much more general purpose. While I have no doubt you could implement the same operations the Nx API provides on top of nested lists, it would be a nightmare - even more so than it already was to implement them on flat binaries. If you can write a general purpose Nx.conv/3 implementation on a nested list, please contact me so I can learn your ways.

Creating Tensors

Now you know what a tensor is, but how can you create one? You’ve already seen one way in this post: using the Nx.tensor/2 factory function. Nx.tensor/2 provides a simple interface for creating tensors with values that are known in advance:

Nx.tensor([[1, 2, 3]])

#Nx.Tensor<
  s64[1][3]
  [
    [1, 2, 3]
  ]
>

You can also specify the :type and :names of the tensor:

Nx.tensor([[1, 2, 3], [4, 5, 6]], type: {:f, 64}, names: [:x, :y])

#Nx.Tensor<
  f64[x: 2][y: 3]
  [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0]
  ]
>

As you’ve seen already, if you don’t specify a type, Nx will infer the type from the highest type in the tensor:

Nx.tensor([[1.0, 2, 3]])

#Nx.Tensor<
  f32[1][3]
  [
    [1.0, 2.0, 3.0]
  ]
>

If you need more complex creation routines, Nx offers a number of them. For example, you can create tensors of random uniform and normal values:

# between 0 and 10
Nx.random_uniform({2, 2}, 0, 10, type: {:s, 32})

#Nx.Tensor<
  s32[2][2]
  [
    [1, 0],
    [3, 9]
  ]
>
# mean 0, standard deviation 2
Nx.random_normal({2, 2}, 0.0, 2.0)

#Nx.Tensor<
  f32[2][2]
  [
    [-0.7820066213607788, -0.37923309206962585],
    [-0.04907086119055748, -2.698871374130249] 
  ]
>

You can also fill a tensor with a constant value:

Nx.broadcast(1, {2, 2})

#Nx.Tensor<
  s64[2][2]
  [
    [1, 1],
    [1, 1]
  ]
>

Or create a tensor which counts along an axis:

Nx.iota({5})

#Nx.Tensor<
  s64[5]
  [0, 1, 2, 3, 4]
>
Nx.iota({5, 5}, axis: 0)

#Nx.Tensor<
  s64[5][5]
  [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [2, 2, 2, 2, 2],
    [3, 3, 3, 3, 3],
    [4, 4, 4, 4, 4]
  ]
>

You can also create tensors from Elixir binaries:

Nx.from_binary(<<1::64-native, 2::64-native, 3::64-native>>, {:s, 64})

#Nx.Tensor<
  s64[3]
  [1, 2, 3]
>

Remember, be aware of endianness!

Nx.from_binary(<<1::64, 2::64, 3::64>>, {:s, 64})

#Nx.Tensor<
  s64[3]
  [72057594037927936, 144115188075855872, 216172782113783808]
>

Finally, if you’re coming over from NumPy, you can load tensors in from NumPy files:

Nx.from_numpy("path/to/numpy.npy")


Or NumPy archives:

Nx.from_numpy("path/to/numpy_archive.npz")


When you start working with Nx, you’ll quickly realize a lot of your time is spent trying to get your data into a tensor. Right now the ecosystem has relatively good support for creating tensors from images (stb_image and nx_evision) and structured data (Explorer), but lacks in other areas such as text, audio, signals, and videos. If you’d like to see any of these bumped up in priority, feel free to reach out with your use case.

Manipulating Tensor Shapes

So now you have a tensor, but it’s in the wrong shape! Can you change it? Yes! Nx has a number of shape manipulation functions in its API. Let’s look at a few:

Reshape

The simplest shape manipulation you might want to do is a basic reshape:

Nx.tensor([[1, 2, 3], [4, 5, 6]])
|> Nx.reshape({6})

#Nx.Tensor<
  s64[6]
  [1, 2, 3, 4, 5, 6]
>

Reshaping is a constant-time operation—it only actually changes the shape property of a tensor. Remember, the data itself is still a flat binary. You are only changing the “view” of the data.

When you’re reshaping, the “size” of the tensor must stay the same. You can’t reshape a tensor with shape {2, 3} to a tensor with shape {8}. You can also use Nx.reshape to change a tensor’s names:

Nx.tensor([[1, 2, 3], [4, 5, 6]])
|> Nx.reshape({2, 3}, [:x, :y])

#Nx.Tensor<
  s64[2][3]
  [
    [1, 2, 3],
    [4, 5, 6]
  ]
>

Transpose

While messing around with Nx.reshape, you might have attempted to permute the dimensions of the tensor with something like:

Nx.tensor([[1, 2, 3], [4, 5, 6]])
|> Nx.reshape({3, 2})

#Nx.Tensor<
  s64[3][2]
  [
    [1, 2],
    [3, 4],
    [5, 6]
  ]
>

only to be surprised by the result. What you actually wanted to do was transpose the tensor:

Nx.transpose(Nx.tensor([[1, 2, 3], [4, 5, 6]]))

#Nx.Tensor<
  s64[3][2]
  [
    [1, 4],
    [2, 5],
    [3, 6]
  ]
>

Transposing a tensor reorders the dimensions of the tensor according to the permutation you give it. It’s easier to see this happening with named tensors:

Nx.tensor([[[1], [2]], [[3], [4]]], names: [:x, :y, :z])
|> Nx.transpose(axes: [:z, :x, :y])

#Nx.Tensor<
  s64[z: 1][x: 2][y: 2]
  [
    [
      [1, 2],
      [3, 4]
    ]
  ]
>

Notice how dimension :z is now where dimension :x was, dimension :x is now where dimension :y was, and dimension :y is now where dimension :z was.

Adding and Squeezing Axes

If you have a tensor with a “scalar” shape {}, and you want to give it some dimensionality, you can use Nx.new_axis:

Nx.tensor(1)
|> Nx.new_axis(-1, :baz)
|> Nx.new_axis(-1, :bar)
|> Nx.new_axis(-1, :foo)

#Nx.Tensor<
  s64[baz: 1][bar: 1][foo: 1]
  [
    [
      [1]
    ]
  ]
>

Nx.new_axis/3 will insert a new axis in the given position (-1 means at the end of the shape) with the given name. The new axis always has size 1. Alternatively, you might want to get rid of size-1 dimensions. You can do this with Nx.squeeze:

Nx.tensor([[[[[[[[1]]]]]]]])
|> Nx.squeeze()

#Nx.Tensor<
  s64
  1
>

Nx.squeeze/1 will “squeeze out” any 1-sized dimensions in the tensor.

You might be thinking, “What’s so special about these functions? Couldn’t I have just reshaped the tensors?” Absolutely. As a matter of fact, all of these functions are built on top of an Nx.reshape operation. However, using Nx.new_axis and Nx.squeeze is a much better illustration of your intent than simply reshaping.
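A quick sketch of that equivalence - the shapes produced by Nx.new_axis and Nx.squeeze line up with what an explicit Nx.reshape would give you:

```elixir
t = Nx.tensor([1, 2, 3])

# new_axis inserts a size-1 dimension at position 0...
Nx.shape(Nx.new_axis(t, 0))
#=> {1, 3}

# ...which is the same shape an explicit reshape produces
Nx.shape(Nx.reshape(t, {1, 3}))
#=> {1, 3}

# squeeze reverses the operation
Nx.shape(Nx.squeeze(Nx.new_axis(t, 0)))
#=> {3}
```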

Manipulating Tensor Types

On top of manipulating shape, you might want to manipulate a tensor’s type. There are two methods which allow you to do this: Nx.as_type and Nx.bitcast.

Nx.as_type does an element-wise type conversion:

Nx.tensor([[1.0, 2.0, -3.0], [4.0, 5.0, 6.0]])
|> Nx.as_type({:s, 64})

#Nx.Tensor<
  s64[2][3]
  [
    [1, 2, -3],
    [4, 5, 6]
  ]
>

You should note that if you are “downcasting”, this conversion can result in underflow, overflow, or a loss of precision and cause some hard-to-debug issues:

Nx.tensor([[1.6, 2.8, -1.2], [3.5, 2.3, 3.2]])
|> Nx.as_type({:u, 8})

#Nx.Tensor<
  u8[2][3]
  [
    [1, 2, 255],
    [3, 2, 3]
  ]
>

The Nx.as_type operation returns entirely new bytes for the underlying tensor data. Alternatively, Nx.bitcast just returns a new “view” of the tensor data. Rather than interpreting the bytes as {:f, 32}, you might want to interpret them as {:s, 32}. This means a bitcast is also a constant-time operation, but there are no guarantees about the values the reinterpreted bytes will represent.

Nx.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
|> Nx.bitcast({:s, 32})

#Nx.Tensor<
  s32[2][3]
  [
    [1065353216, 1073741824, 1077936128],
    [1082130432, 1084227584, 1086324736]
  ]
>

Basic Tensor Operations

So you’ve created some tensors, got them in the right shape and type, and now you want to do something with them. But, what can you actually do? A lot!

The most basic operations you can perform on tensors are element-wise unary operations. These operations “loop” through each value in the tensor and perform some mathematical operation on the value to return a new value in its place. For example, you can compute the element-wise exponential with Nx.exp:

Nx.exp(Nx.tensor([1, 2, 3]))

#Nx.Tensor<
  f32[3]
  [2.7182817459106445, 7.389056205749512, 20.08553695678711]
>

Or you can compute element-wise trigonometric functions:

Nx.sin(Nx.tensor([1, 2, 3]))

#Nx.Tensor<
  f32[3]
  [0.8414709568023682, 0.9092974066734314, 0.14112000167369843]
>
Nx.cos(Nx.tensor([1, 2, 3]))

#Nx.Tensor<
  f32[3]
  [0.5403022766113281, -0.416146844625473, -0.9899924993515015]
>
Nx.tan(Nx.tensor([1, 2, 3]))

#Nx.Tensor<
  f32[3]
  [1.5574077367782593, -2.185039758682251, -0.14254654943943024]
>

Or even an element-wise natural log:

Nx.log(Nx.tensor([1, 2, 3]))

#Nx.Tensor<
  f32[3]
  [0.0, 0.6931471824645996, 1.0986123085021973]
>

If it helps to think of these functions as an Enum.map/2, you can - just remember that tensors are not lists and do not implement the Enumerable protocol. The element-wise implementations on tensors are more efficient than calling Enum.map. This is because if you’re using a compiler or backend, you’ll be able to take advantage of specialized kernels which target the CPU or GPU. Additionally, as the dimensionality of your tensor increases, so too would the “nesting” of an Enum.map/2 implementation which attempts to mimic the element-wise operation:

Nx.sin(Nx.tensor([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]))

#Nx.Tensor<
  f32[2][2][3]
  [
    [
      [0.8414709568023682, 0.9092974066734314, 0.14112000167369843],
      [-0.756802499294281, -0.9589242935180664, -0.279415488243103]
    ],
    [
      [0.6569865942001343, 0.9893582463264465, 0.41211849451065063],
      [-0.5440211296081543, -0.9999902248382568, -0.5365729331970215]
    ]
  ]
>
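For intuition only, here’s what mimicking that 3-D element-wise sine with Enum.map/2 over nested lists would look like - one level of nesting per dimension (don’t do this in practice):

```elixir
nested = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]

# Rank 3 means three nested maps; a rank-4 tensor would need four
Enum.map(nested, fn matrix ->
  Enum.map(matrix, fn row ->
    Enum.map(row, &:math.sin/1)
  end)
end)
```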

You should never write code that loops through the values in a tensor to perform operations on individual elements. Instead, you should write those kinds of operations in terms of the existing element-wise unary functions.

Broadcasting

On top of the unary operators, Nx supports a number of binary operations on tensors such as add, subtract, multiply, and divide:

Nx.add(Nx.tensor([1, 2, 3]), Nx.tensor([4, 5, 6]))

#Nx.Tensor<
  s64[3]
  [5, 7, 9]
>
Nx.multiply(Nx.tensor([1, 2, 3]), Nx.tensor([4, 5, 6]))

#Nx.Tensor<
  s64[3]
  [4, 10, 18]
>

These binary operations work element-wise. The values between the two tensors are zipped and added, multiplied, subtracted, etc. But, what happens if you encounter a situation where the shapes of the tensors don’t match? Nx will attempt to broadcast them.

Recall from the creation examples that we used Nx.broadcast to fill a tensor with a constant value. What Nx.broadcast actually does is attempt to apply Nx’s broadcasting semantics. A tensor can be broadcast to a certain shape if:

  1. It is a scalar shape {}, OR
  2. The size of each dimension in the tensor matches the corresponding size of each dimension in the target shape, OR
  3. Any dimension whose size does not match the corresponding dimension in the target shape has size 1

Broadcasting gives us a way to efficiently “repeat” values without consuming any additional memory. For example, say you have two tensors with shapes {1, 1000} and {1000, 1000} respectively. You want to add the first tensor to the second, repeating its row of 1000 elements across the first dimension of the second tensor. Broadcasting allows you to accomplish this without explicitly repeating values yourself:

# {1, 3}
a = Nx.tensor([[1, 2, 3]])
b = Nx.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
Nx.add(a, b)

#Nx.Tensor<
  s64[3][3]
  [
    [2, 4, 6],
    [5, 7, 9],
    [8, 10, 12]
  ]
>

Notice how in the above example [1, 2, 3] is added to [1, 2, 3], [4, 5, 6], and [7, 8, 9] in the second tensor. We didn’t need to do anything! Nx took care of the repetition for us!
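Rule 1 above also covers the everyday case of combining a tensor with a plain number - the scalar is broadcast across the entire shape:

```elixir
t = Nx.tensor([[1, 2, 3], [4, 5, 6]])

# The scalar 10 is "repeated" across the whole {2, 3} shape
Nx.add(t, 10)
#=> values [[11, 12, 13], [14, 15, 16]]
```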

Aggregates

In a previous example, you saw how you could compute a grayscale image by calculating the maximum value along an image’s channels. In that example, we used the reduce_max aggregate function. Nx offers a number of other aggregates for computing the sum, product, min, and mean of a tensor along an axis.

To understand how these aggregates work, let’s consider Nx.sum. You can use Nx.sum to compute the sum of all values in the tensor:

Nx.sum(Nx.tensor([[1, 2, 3], [4, 5, 6]]))

#Nx.Tensor<
  s64
  21
>

Or you can use it to compute the sum along an axis. It’s easy to see what this looks like with named tensors:

Nx.sum(Nx.tensor([[1, 2, 3], [4, 5, 6]], names: [:x, :y]), axes: [:y])

#Nx.Tensor<
  s64[x: 2]
  [6, 15]
>

Notice how the :y axis “disappears”. You reduce the axis away by summing over all of its elements. You can see this contraction a little better if you keep the axes:

Nx.sum(Nx.tensor([[1, 2, 3], [4, 5, 6]], names: [:x, :y]), axes: [:y], keep_axes: true)

#Nx.Tensor<
  s64[x: 2][y: 1]
  [
    [6],
    [15]
  ]
>
Nx.sum(Nx.tensor([[1, 2, 3], [4, 5, 6]], names: [:x, :y]), axes: [:x], keep_axes: true)

#Nx.Tensor<
  s64[x: 1][y: 3]
  [
    [5, 7, 9]
  ]
>

Immutability

One final detail that is essential for you to understand about Nx is that tensors are immutable. You cannot change the underlying data of an object. All of the operations you saw illustrated today return new tensors with brand new data. This might lead you to think, “But wait, isn’t that really inefficient? We’re copying really large binaries every time!” The answer, at least for the Elixir implementations of tensors, is yes. Fortunately though, Nx offers pluggable backends and compilers which stage out calculations to external libraries with efficient implementations of these operations. Immutability is actually a huge benefit when constructing graphs for just-in-time compilation and automatic differentiation.
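A small demonstration of this - every operation returns a new tensor and leaves its input untouched:

```elixir
a = Nx.tensor([1, 2, 3])
b = Nx.add(a, 1)

# `a` still holds its original values; `b` is a brand new tensor
Nx.to_flat_list(a)
#=> [1, 2, 3]
Nx.to_flat_list(b)
#=> [2, 3, 4]
```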

We will cover graph construction in a future post. For now, just understand that in Nx we don’t ever change the contents of a tensor, we just call functions which return new tensors.

Conclusion

This post was designed to be an introduction to Nx for those who haven’t worked with machine learning or numerical computing before. I hope now you feel a little more comfortable with the idea of a tensor and working with some of the functions in the Nx API. If you want to see tensors in action, I recommend checking out some of my previous posts, and staying tuned in for future posts with applications of Nx to the real world.

Until next time!

Catching Fraud with Elixir and Axon

Hand holding credit card while using laptop

Machine Learning Advisor

Sean Moriarity

Introduction

Fraud and identity theft affect millions of people every year. Credit card fraud, in which an individual makes an unauthorized payment using a credit card, debit card, or through other means, is just one common form.

There are around 1 billion credit card transactions every day—so fraud detection systems need to be able to process thousands of transactions per second. Additionally, the cost of fraud can be high. Downtime is not an option. Fraud detection systems must be built with scalability and fault tolerance in mind—both of which are strengths of Elixir as a programming language.
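The back-of-the-envelope arithmetic behind that throughput claim:

```elixir
transactions_per_day = 1_000_000_000
seconds_per_day = 24 * 60 * 60  # 86_400

# Average sustained load; a real system must also absorb peaks
div(transactions_per_day, seconds_per_day)
#=> 11574
```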

A fraud detection system also needs to be operationally reliable. It must correctly tag fraudulent transactions and avoid overflagging normal transactions or risk losing customer patience and trust. Given the sheer volume of transactions and the amount of data associated with each transaction, it would be impossible to detect fraudulent transactions by hand. Financial transactions come with a rich set of features and an abundance of examples of transactions. It is the perfect application of machine learning.

With Axon, you can marry the operational strengths of Elixir with the functional strengths of deep learning to design a scalable, fault-tolerant, and precise fraud detection system. In this post, you’ll learn how to use Axon to design a model to catch fraudulent transactions.

Installation

You’ll need to install Axon to create and train a model for fraud detection, EXLA for hardware acceleration, Nx for manipulating transaction data, and Explorer for parsing and slicing CSV data.

Mix.install([
  {:axon, "~> 0.1.0-dev", github: "elixir-nx/axon"},
  {:exla, "~> 0.1.0-dev", github: "elixir-nx/nx", sparse: "exla"},
  {:nx, "~> 0.1.0-dev", github: "elixir-nx/nx", sparse: "nx", override: true},
  {:explorer, "~> 0.1.0-dev", github: "elixir-nx/explorer"}
])


The Data

The data for this example can be downloaded on Kaggle. It consists of around 300,000 European card transactions, of which only around 500 are fraudulent. We’ll need to be cognizant of this massive imbalance in our data when designing and evaluating our model. If our model marks every transaction as legitimate, it would still achieve 99% accuracy!
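To see just how misleading raw accuracy is here, consider a quick sketch using the approximate figures above (a "model" that labels every transaction as legitimate):

```elixir
# Approximate figures from the dataset description above
total_transactions = 300_000
fraudulent = 500

# A "model" that predicts legitimate for every transaction is
# correct on every legitimate example and wrong on every fraud
baseline_accuracy = (total_transactions - fraudulent) / total_transactions
# ≈ 0.998, or about 99.8% accuracy while catching zero fraud
```

This is why we’ll evaluate with metrics other than accuracy later on.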

In this dataset, the features were derived from the original transaction attributes using Principal Component Analysis (PCA) in order to anonymize and protect sensitive user information. In a real system, you would do this kind of feature extraction yourself from transaction attributes such as amount, location, vendor, etc. In an end-to-end system, you would start by using Explorer to conduct an Exploratory Data Analysis (EDA), and then determine an appropriate set of features for a model you want to test.

Start by downloading the data to a local directory and loading the CSV using Explorer:

df = Explorer.DataFrame.read_csv!("creditcard.csv", dtypes: [{"Time", :float}])


Now, split your data into a train set and a test set. Splitting into train and test sets is important to validate that your model does not overfit to specific examples. You should notice above that the data is ordered temporally—the time of transaction increases from the first example to the last.

By taking a fixed number of examples from the end of the dataset, you are marking all transactions before a certain time as train and all transactions after a certain time as test. This is important to note because it could be a form of bias in your dataset. If your train set time window is not sufficiently representative of transactions, you will end up with a model which performs poorly. In a real system, you’d likely want to extend this train window over the course of several days. During data analysis you would want to determine what “normal” data should look like, and ensure both your train and test set are representative of that “normal.”

num_examples = Explorer.DataFrame.n_rows(df)
num_train = ceil(0.85 * num_examples)
num_test = num_examples - num_train

train_df = Explorer.DataFrame.slice(df, 0, num_train)
test_df = Explorer.DataFrame.slice(df, num_train, num_test)


This code takes the first 85% of examples for training, and leaves the last 15% of examples for testing.

Next, you’ll need to split both train and test sets into sets of features and sets of targets. If you recall from my previous Axon article, Axon requires examples to consist of tuples of {features, targets} to train a model. Your target for this example is Class - all other columns are features:

x_train_df = Explorer.DataFrame.select(train_df, &(&1 == "Class"), :drop)
y_train_df = Explorer.DataFrame.select(train_df, &(&1 == "Class"), :keep)
x_test_df = Explorer.DataFrame.select(test_df, &(&1 == "Class"), :drop)
y_test_df = Explorer.DataFrame.select(test_df, &(&1 == "Class"), :keep)


Notice how each of your examples is currently an Explorer DataFrame. Axon doesn’t understand how to work with DataFrames. Instead, you need to convert your data into Nx tensors which can be passed into an Axon training loop.

to_tensor = fn df ->
  df
  |> Explorer.DataFrame.names()
  |> Enum.map(&(Explorer.Series.to_tensor(df[&1]) |> Nx.new_axis(-1)))
  |> Nx.concatenate(axis: 1)
end


The function above is a bit verbose, but it gets the job done. There is an active issue for adding a native to_tensor function for Explorer DataFrames. Contributions are welcome!

x_train = to_tensor.(x_train_df)
y_train = to_tensor.(y_train_df)
x_test = to_tensor.(x_test_df)
y_test = to_tensor.(y_test_df)


You now have four large tensors representing the entirety of train and test sets. Axon requires training in minibatches which means you pass some number of examples to a single training step, update the model, and move on to the next minibatch. Each example is a tuple: {features, targets} where features is a batched tensor of example features and targets is a batched tensor of example targets. You must pass examples in a data structure that implements the Enumerable protocol. It’s most common to use a Stream to lazily load data into an Axon training loop.

batched_train_inputs = Nx.to_batched_list(x_train, 2048)
batched_train_targets = Nx.to_batched_list(y_train, 2048)
batched_train = Stream.zip(batched_train_inputs, batched_train_targets)

batched_test_inputs = Nx.to_batched_list(x_test, 2048)
batched_test_targets = Nx.to_batched_list(y_test, 2048)
batched_test = Stream.zip(batched_test_inputs, batched_test_targets)


batched_train and batched_test will lazily return the target-feature tuples required to train and evaluate your Axon model. With your data prepared for training, it’s time to implement the model.

Before training, there is one final step needed to maximize the performance of your model. You’ll want to normalize the input data such that each column is on a common scale between zero and one. You can achieve this in a number of ways. For this example, you’ll scale by dividing each feature by the max feature value in the training data:

train_max = Nx.reduce_max(x_train, axes: [0], keep_axes: true)

normalize = fn {batch, target} ->
  {Nx.divide(batch, train_max), target}
end

batched_train = batched_train |> Stream.map(&Nx.Defn.jit(normalize, [&1], compiler: EXLA))
batched_test = batched_test |> Stream.map(&Nx.Defn.jit(normalize, [&1], compiler: EXLA))


The Model

Fraud detection is a well-researched area of machine learning. There are entire books on the subject. There are a number of models you could choose from. In an end-to-end example, you would want to experiment with many different types of models such as simple regression, decision trees, neural networks, etc. In this example, you’ll just implement one such model. If you’d like to experiment with more, check out the new Scholar package which is intended to include a number of machine learning estimators that can be applied to problems such as this.

The fraud detection data you have is structured, meaning it comes in tabular form where each value in a row represents some feature. You can naturally train a small feed-forward neural network to classify transactions as fraudulent or legitimate. Your model should output a probability between zero and one, with probabilities closer to one indicating a fraudulent transaction. Because you have a relatively small input feature space, you can settle for relatively small intermediate layers. In this example, you can get away with three hidden layers with a hidden size of 256, where 256 is the dimensionality of each individual hidden layer. Feel free to experiment with an architecture different from the one seen here. For example, you might want to try different activations, hidden sizes, and dropout configurations.

While a neural network is a suitable model for this application, there are a few reasons you might not want to use a neural network in practice. For example, you might want a more interpretable model which can help you explain why certain transactions were marked as fraudulent. Additionally, you might find that simpler models achieve comparable performance with less compute. All of these factors would need to be considered in an end-to-end example where you train multiple types of models for the same problem. Choosing the correct model requires reasoning about functional and operational requirements of the overall system.

model =
  Axon.input({nil, 30})
  |> Axon.dense(256)
  |> Axon.relu()
  |> Axon.dense(256)
  |> Axon.relu()
  |> Axon.dropout(rate: 0.3)
  |> Axon.dense(256)
  |> Axon.relu()
  |> Axon.dropout(rate: 0.3)
  |> Axon.dense(1)
  |> Axon.sigmoid()


Your model will look like this:

------------------------------------------------------------------------------------------------------
                                                Model
======================================================================================================
 Layer                                Shape        Policy              Parameters   Parameters Memory
======================================================================================================
 input_0 ( input )                    {nil, 30}    p=f32 c=f32 o=f32   0            0 bytes
 dense_0 ( dense[ "input_0" ] )       {nil, 256}   p=f32 c=f32 o=f32   7936         31744 bytes
 relu_0 ( relu[ "dense_0" ] )         {nil, 256}   p=f32 c=f32 o=f32   0            0 bytes
 dense_1 ( dense[ "relu_0" ] )        {nil, 256}   p=f32 c=f32 o=f32   65792        263168 bytes
 relu_1 ( relu[ "dense_1" ] )         {nil, 256}   p=f32 c=f32 o=f32   0            0 bytes
 dropout_0 ( dropout[ "relu_1" ] )    {nil, 256}   p=f32 c=f32 o=f32   0            0 bytes
 dense_2 ( dense[ "dropout_0" ] )     {nil, 256}   p=f32 c=f32 o=f32   65792        263168 bytes
 relu_2 ( relu[ "dense_2" ] )         {nil, 256}   p=f32 c=f32 o=f32   0            0 bytes
 dropout_1 ( dropout[ "relu_2" ] )    {nil, 256}   p=f32 c=f32 o=f32   0            0 bytes
 dense_3 ( dense[ "dropout_1" ] )     {nil, 1}     p=f32 c=f32 o=f32   257          1028 bytes
 sigmoid_0 ( sigmoid[ "dense_3" ] )   {nil, 1}     p=f32 c=f32 o=f32   0            0 bytes
------------------------------------------------------------------------------------------------------

Training the Model

With your data prepped for training and your model defined, it’s time to train. Recall that your data is incredibly imbalanced, which means you need to account for this imbalance when updating the model. You need to penalize the model more heavily for missing fraudulent transactions. You can achieve this penalty in Axon using class weights.

In a problem where you have a balanced dataset where the number of examples per class is equal, you’d want to update your model parameters equally per class. For example, if you are classifying images of cats versus dogs with equal numbers of both classes, you’d want to update your model for failing to classify a picture of a cat proportional to how you’d update your model for failing to classify a picture of a dog.

With an imbalanced dataset, you want your updates to be proportional to the overall data distribution. You want to tell your model to really pay attention to low-density classes because they are much more important than common classes. Axon loss functions such as binary_cross_entropy/3 allow you to pass weight parameters that specify the importance or weight of each class. A common way to specify these weights is to make them proportional to the overall number of occurrences in the training set. In this example, you should count both positive and negative occurrences and specify the weights accordingly:

fraud = Nx.sum(y_train) |> Nx.to_number()
legit = Nx.size(y_train) - fraud

loss =
  &Axon.Losses.binary_cross_entropy(
    &1,
    &2,
    negative_weight: 1 / legit,
    positive_weight: 1 / fraud,
    reduction: :mean
  )


Axon’s training loop will accept an arity-2 function as a loss function, which means you can parameterize loss functions on target-prediction pairs. Notice how the positive weight will be much larger than the negative weight because the number of fraudulent transactions is far smaller than the number of legitimate transactions. This will ensure that the penalty for incorrectly classifying a fraudulent transaction will be much larger than for a legitimate transaction.
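As a rough illustration of how lopsided the two weights are, here is a sketch using the approximate class counts from the dataset description (the code above computes the real counts from `y_train`):

```elixir
# Illustrative counts only, taken from the dataset description;
# the actual weights are computed from y_train above
fraud = 500
legit = 299_500

positive_weight = 1 / fraud  # 0.002
negative_weight = 1 / legit  # ≈ 0.0000033

# Missing a fraudulent example costs roughly 600x more than
# misclassifying a legitimate one
ratio = positive_weight / negative_weight
# ≈ 599
```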

Next, you’ll need to define an optimizer. In this example, you will use the adam optimizer with a learning rate of 0.001. The choice of optimizer and learning rate is somewhat arbitrary, though adam typically achieves decent performance. Feel free to experiment with different optimizer and learning rate configurations to see which one yields the best performance!

optimizer = Axon.Optimizers.adam(1.0e-3)


Finally, you’ll need to define and run the training loop. You’ll want to track metrics, but accuracy doesn’t make sense in this case. Remember that just classifying all transactions as legitimate will result in greater than 99% accuracy. Instead, you’ll want to keep track of precision and recall. Precision measures the proportion of positive (fraudulent) classifications that were predicted accurately. More concretely, precision answers the question: How often is the model correct when it says a transaction is fraudulent? Recall measures the proportion of positive (fraudulent) examples that were identified correctly. More concretely, recall answers the question: How often did the model catch a fraudulent transaction?
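In terms of confusion-matrix counts, the two metrics work out as follows. This sketch uses hypothetical counts for illustration, not values produced by this model:

```elixir
# Hypothetical confusion-matrix counts for illustration
true_positives = 40
false_positives = 700
false_negatives = 10

# Precision: of the transactions flagged as fraud, how many were fraud?
precision = true_positives / (true_positives + false_positives)
# ≈ 0.054

# Recall: of the actual frauds, how many did the model flag?
recall = true_positives / (true_positives + false_negatives)
# = 0.8
```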

Axon supports precision and recall out of the box, so you can track them without issue in your training loop. To define the loop, start with the Axon.Loop.trainer/3 factory method, instrument the loop with metrics, and then call Axon.Loop.run/3. Feel free to adjust some of the loop parameters such as the number of epochs!

model_state =
  model
  |> Axon.Loop.trainer(loss, optimizer)
  |> Axon.Loop.metric(:precision)
  |> Axon.Loop.metric(:recall)
  |> Axon.Loop.run(batched_train, epochs: 30, compiler: EXLA)


During training, you will see:

Epoch: 0, Batch: 100, loss: 0.0000036 precision: 0.0453421 recall: 0.6534294
Epoch: 1, Batch: 100, loss: 0.0000025 precision: 0.0855192 recall: 0.7102020
Epoch: 2, Batch: 100, loss: 0.0000021 precision: 0.0726182 recall: 0.7330216
Epoch: 3, Batch: 100, loss: 0.0000018 precision: 0.0694896 recall: 0.7416568
Epoch: 4, Batch: 100, loss: 0.0000017 precision: 0.0664449 recall: 0.7444073
Epoch: 5, Batch: 100, loss: 0.0000016 precision: 0.0666172 recall: 0.7664092
Epoch: 6, Batch: 100, loss: 0.0000015 precision: 0.0627265 recall: 0.7720198
Epoch: 7, Batch: 100, loss: 0.0000015 precision: 0.0596212 recall: 0.7724599
Epoch: 8, Batch: 100, loss: 0.0000014 precision: 0.0618357 recall: 0.7760903
Epoch: 9, Batch: 100, loss: 0.0000014 precision: 0.0609748 recall: 0.7772452
Epoch: 10, Batch: 100, loss: 0.0000013 precision: 0.0494180 recall: 0.7753478
Epoch: 11, Batch: 100, loss: 0.0000013 precision: 0.0606631 recall: 0.7859913
Epoch: 12, Batch: 100, loss: 0.0000013 precision: 0.0624361 recall: 0.7868164
Epoch: 13, Batch: 100, loss: 0.0000012 precision: 0.0637707 recall: 0.8099188
Epoch: 14, Batch: 100, loss: 0.0000012 precision: 0.0655236 recall: 0.7998114
Epoch: 15, Batch: 100, loss: 0.0000012 precision: 0.0634568 recall: 0.8133428
Epoch: 16, Batch: 100, loss: 0.0000011 precision: 0.0657043 recall: 0.8067421
Epoch: 17, Batch: 100, loss: 0.0000011 precision: 0.0661718 recall: 0.7993163
Epoch: 18, Batch: 100, loss: 0.0000011 precision: 0.0702126 recall: 0.8174682
Epoch: 19, Batch: 100, loss: 0.0000011 precision: 0.0699272 recall: 0.8059170
Epoch: 20, Batch: 100, loss: 0.0000011 precision: 0.0635841 recall: 0.8119520
Epoch: 21, Batch: 100, loss: 0.0000010 precision: 0.0750699 recall: 0.8186468
Epoch: 22, Batch: 100, loss: 0.0000010 precision: 0.0725016 recall: 0.8114567
Epoch: 23, Batch: 100, loss: 0.0000010 precision: 0.0740755 recall: 0.8031117
Epoch: 24, Batch: 100, loss: 0.0000010 precision: 0.0740027 recall: 0.8257190
Epoch: 25, Batch: 100, loss: 0.0000010 precision: 0.0777240 recall: 0.8260726
Epoch: 26, Batch: 100, loss: 0.0000009 precision: 0.0769449 recall: 0.8318481
Epoch: 27, Batch: 100, loss: 0.0000009 precision: 0.0477672 recall: 0.8093942
Epoch: 28, Batch: 100, loss: 0.0000009 precision: 0.0697432 recall: 0.8230080
Epoch: 29, Batch: 100, loss: 0.0000009 precision: 0.0677453 recall: 0.8194719

The training loop returns a model state which can be used to make predictions and evaluate the model. You can see the model performs decently well during training, but we really only care about performance on the test set, so let’s see how our model does!

Evaluating the Model

First, let’s see how many examples are in the test set by inspecting the shape of our test tensor:

Nx.shape(y_test)


Which will show:

{42721, 1}

Overall we have 42721 transactions. We can calculate how many are fraudulent by computing the sum of labels in the test set because fraudulent transactions have a class label of 1:

Nx.sum(y_test)


Which will show:

#Nx.Tensor<
  s64
  52
>

Overall, there are 52 fraudulent transactions. Now, let’s see how well our model performs at detecting these fraudulent transactions. We can track the raw metrics which go into precision and recall and map those to real-life metrics. For example, in this evaluation loop, you can mark true positives as fraudulent transactions detected, true negatives as legitimate transactions accepted, false positives as legitimate transactions declined, and false negatives as fraudulent transactions accepted:

final_metrics =
  model
  |> Axon.Loop.evaluator(model_state)
  |> Axon.Loop.metric(:true_positives, "fraud_declined", :running_sum)
  |> Axon.Loop.metric(:true_negatives, "legit_accepted", :running_sum)
  |> Axon.Loop.metric(:false_positives, "legit_declined", :running_sum)
  |> Axon.Loop.metric(:false_negatives, "fraud_accepted", :running_sum)
  |> Axon.Loop.run(batched_test, compiler: EXLA)


Running the cell will show:

Batch: 20, fraud_accepted: 9 fraud_declined: 43 legit_accepted: 42214 legit_declined: 742

So overall our model correctly declined 43 fraudulent transactions and incorrectly declined 742 legitimate transactions. It accepted nine fraudulent transactions and 42214 legitimate transactions. The model declined about 2% of the legitimate transactions in the dataset. These metrics aren’t bad, but there’s definitely room for improvement. See if you can tweak the model or training process to achieve better performance.
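The roughly 2% figure falls straight out of the evaluation counts above:

```elixir
# Counts reported by the evaluation loop above
legit_declined = 742
legit_accepted = 42_214

decline_rate = legit_declined / (legit_declined + legit_accepted)
# ≈ 0.017, i.e. roughly 2% of legitimate transactions declined
```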

Moving Forward

In this post, you learned how to build an algorithm that identifies fraudulent credit card transactions. With a trained model, you could opt to expose the model’s predictions as a service in your system and provide real-time decisions on transactions. There are a number of architectural decisions which go into building a real-time machine learning system. Stay tuned for next month to see what those considerations are, and what makes Elixir a great choice for building real-time machine learning. Until next time :)

Deploying Machine Learning Models with Elixir

Paper with checkboxes

Machine Learning Advisor

Sean Moriarity

Introduction

In my last post, we walked through how to build an Axon model to detect credit card fraud from anonymized features of real credit card transactions.

While the final model performed relatively well at the fraud detection task, we can’t actually detect any fraud from our Livebook. Models aren’t meant to live in a notebook forever. After your model has been validated (a beast of a topic I will discuss in another post), you should begin to consider how you will put your model into production.

In this post, we’ll go over considerations for model deployment, and what model deployment scenarios might look like for this particular example. This is the first of a few posts I plan to do on machine learning operations or MLOps.

It’s important to note that there is no definitive guide to model deployments. What your model looks like in production depends entirely on your business needs.

Even prior to training your model you should set forth objectives and evaluate your model against those objectives to determine if the model you choose to deploy meets your needs, or if training a model is even worth it.

While this tutorial focuses mainly on deployment options for Deep Learning based models, you might find much more success, and a lot less cost, in deploying simpler models. 80% of “machine learning” problems can probably reasonably be solved with linear regression.

You should really only introduce complexity when it becomes absolutely necessary, or if you’re an academic trying to get published.

Exporting and Versioning Large Models

In order to operationalize a model, you need a way to bring it outside of the environment you trained it in. Fortunately, Axon recently introduced two new functions for the purpose of exporting trained neural networks: Axon.serialize/3 and Axon.deserialize/2.

Axon.serialize/3 serializes tuples of {model, params} into an Elixir binary which you can write to an external file for later use. Under the hood, Axon uses :erlang.term_to_binary/2; however, if you attempt to deserialize the serialized model with :erlang.binary_to_term, you won’t get the result you’re expecting.
Axon.serialize/3{model, params} 的元组序列化为 Elixir 二进制文件,您可以将其写入外部文件供以后使用。在引擎盖下,Axon 使用 :erlang.term_to_binary/2 ;但是,如果您尝试使用 :erlang.binary_to_term 反序列化序列化模型,您将不会得到预期的结果。

Before converting to a binary, Axon does some transformations to get both the Axon model and parameter container into a form suitable for serialization. The transformations are implementation details that help guarantee backwards compatibility as Axon is developed moving forward. You can reasonably assume that any model serialized using Axon.serialize/3 can be deserialized with Axon.deserialize/2.

To ensure compatibility, you should always serialize models with Axon.serialize/3 and always deserialize models with Axon.deserialize/2.

As a final consideration, you should only deserialize models from trusted sources. Axon.deserialize/2 uses the :safe option in :erlang.binary_to_term/2 under the hood; however, you shouldn’t attempt to load models from untrusted sources.

In the credit card fraud example, you train a model with the following training loop:

model_state =
  model
  |> Axon.Loop.trainer(loss, optimizer)
  |> Axon.Loop.metric(:precision)
  |> Axon.Loop.metric(:recall)
  |> Axon.Loop.run(batched_train, epochs: 30, compiler: EXLA)


model_state represents params in an Axon {model, params} tuple. You can write a serialized representation of your Axon model to disk by adding the following lines after your training loop:

model
|> Axon.serialize(params)
|> then(&File.write!("model.axon", &1))


This will save your serialized version to model.axon in the current working directory. You should save all of your Axon models with .axon as a convention. You can then easily re-use your model later on by reading it from disk into memory:

{model, params} = File.read!("model.axon") |> Axon.deserialize()
Axon.predict(model, params, input)


As you’ll see later on, there are other ways you’ll need to “persist” your models for use with external serving solutions. In that case, I still recommend saving a .axon version of your models in addition to other artifacts necessary for your deployment scenario. This will guarantee you can iterate and experiment with your model from Axon and convert to other persistence formats if necessary.

Persisting your model for use in deployments necessitates at least some sort of storage solution, and, if you plan on iterating over multiple models (which you should), a versioning solution for models as well.

For small projects, you can probably get away with just throwing models in a models directory and using git lfs. For larger projects, you’ll definitely want a better solution. Model version control has different flavors and requirements at each step of the model development and deployment cycle.

For example, when selecting and training models, you’ll want to track hyperparameters associated with each model, as well as the metadata and metrics collected during training which are associated with those versioned models.

Once you select a model, you need a way to update your application to use your new and improved model, preferably without downtime. This is where out-of-the-box model serving tools come into play. Most model serving solutions let you set up model repositories on remote filesystems such as S3, and typically support serving multiple versions of the same model at different endpoints.

In a real-world project, you’d typically also save checkpoints during model training. Checkpoints are snapshots of state at various time intervals. Axon allows you to checkpoint your entire training state using the Axon.Loop.checkpoint event handler.

In the Python ecosystem, checkpointing state is seen as a form of “fault-tolerance” because it allows you to resume training from a last good state in the event of some training failure. If you have a long-running training job, it’s definitely good practice to add checkpoints at fixed intervals. For example, in the fraud detection example, the training loop looks like this:

model_state =
  model
  |> Axon.Loop.trainer(loss, optimizer)
  |> Axon.Loop.metric(:precision)
  |> Axon.Loop.metric(:recall)
  |> Axon.Loop.run(batched_train, epochs: 30, compiler: EXLA)


You can simply add Axon.Loop.checkpoint to the loop:

model_state =
  model
  |> Axon.Loop.trainer(loss, optimizer)
  |> Axon.Loop.metric(:precision)
  |> Axon.Loop.metric(:recall)
  |> Axon.Loop.checkpoint()
  |> Axon.Loop.run(batched_train, epochs: 30, compiler: EXLA)


This will save training checkpoints under the checkpoints path with the file pattern checkpoint_{epoch}_{iteration}.ckpt after every training epoch. You can select any checkpoint and resume training with a few Axon functions:

# Load the last checkpoint from epoch 30
path = "checkpoints/checkpoint_30_1000.ckpt"
last_ckpt_state =
  path
  # read file
  |> File.read!()
  # deserialize last training state
  |> Axon.Loop.deserialize_state()
# Create a loop and tell Axon to run from previous state
model_state =
  model
  |> Axon.Loop.trainer(loss, optimizer)
  |> Axon.Loop.metric(:precision)
  |> Axon.Loop.metric(:recall)
  |> Axon.Loop.from_state(last_ckpt_state)
  |> Axon.Loop.run(batched_train, epochs: 30, compiler: EXLA)


Model Deployment Scenarios

With a trained model in hand, you need to go about integrating it into your application.

What that integration looks like is application dependent; however, there are a few common scenarios to consider when it comes to model deployment. Your deployment scenario will dictate how you integrate your model into a production solution. One thing you’ll probably find is common to every deployment scenario, no matter what: latency is king.
该集成的外观取决于应用程序;但是,在模型部署方面,有一些常见的场景需要考虑。您的部署方案将决定您如何将模型集成到生产解决方案中。您可能会发现,无论延迟是什么,您的部署场景都很常见。

Not considering functional performance (e.g. how accurate a model is), latency is the most critical metric to consider during model deployment. You would probably have more positive user feedback from serving random predictions in milliseconds than serving perfect predictions in minutes to hours (Please don’t actually do this, it’s incredibly irresponsible). Latency should be the driving consideration when thinking about your deployment scenario.

Cloud vs. Edge

The first thing to consider is whether you want to serve inferences from the cloud or at the edge. For simplicity, I am grouping on-prem solutions into the cloud bucket.

Cloud inference happens over the network. The model(s) live on a server at some endpoint, users make requests to the endpoint and receive inferences back. Edge deployments happen on edge devices. The model lives on individual devices, such as mobile phones, and serves inferences on demand without making requests to some inference server.

In some applications the choice of cloud vs. edge is obvious. For example, you wouldn’t try to deploy GPT-3 at the edge.

The choice is typically less obvious when you’re building an application meant to be used at the edge. In that case you need to consider how deploying to the cloud vs. deploying models at the edge impact your business needs.

If you intend your application to be functional without a reliable network connection, or to yield low-latency predictions without access to high-speed internet, you’re definitely going to want an edge deployment.

Today, most edge devices come built in with accelerators and runtimes specifically optimized for machine learning. iPhones, for example, include the CoreML framework which allows you to run machine learning models at the edge. TensorFlow Lite is another extremely popular framework for edge ML deployments.

The prospects for Elixir machine learning at the edge are particularly exciting when you consider Elixir already has an excellent edge framework in Nerves. If you’d like to help make this ecosystem grow, consider joining the EEF Machine Learning Working Group.

Cloud deployments are probably “easier” in the sense that you won’t need to target multiple devices and worry about optimizing your models to use less compute and storage. In the cloud you can always scale up compute, and the model you deploy can serve predictions to any device with a network connection.

Additionally, updating models in the cloud is significantly easier than updating models at the edge. As with critical firmware updates, you can never get 100% of your users to upgrade to newer versions, so you can essentially guarantee users will be walking around with outdated versions of your model. With a cloud deployment, your server is always the source of truth.

Let’s consider our fraud detection model. Does it make sense to deploy in the cloud or at the edge? While I’m sure there’s a scenario where an edge deployment makes sense, our particular model is probably better deployed in the cloud. We can assume reliable internet because credit card transactions already take place over the internet.

Online vs. Batch Inference

Another consideration for your application is whether you will need to perform online or batch inference.

Online inference happens on-demand–your model serves predictions upon requests. Batch inference happens offline–you perform batch prediction jobs, typically at a fixed interval, and serve or use cached predictions at application runtime.

Batch inference jobs are kind of going out of style, but they might make sense for your application. In some ways it’s advantageous to perform batch inference because you can scale up compute and deal with predictions in bulk (rather than at batch size 1). Additionally, latency is slightly less of a concern–though you should still be concerned about the latency impact of serving cached predictions.

In certain applications batch inference doesn’t make sense at all. In some situations you can’t possibly make batch predictions on every input you might see in production.

Applications that make use of image recognition, for example, are not good candidates for batch inference. You can’t generate predictions for every possible image you might see at runtime ahead of time.

However, even if a model is suitable for batch inference, it doesn’t mean that you should serve it in an offline manner. For example, recommendation systems can be served offline–you periodically update a user’s “embedding” based on their shopping, viewing, and whatever history, and find similar products to their saved embedding at runtime.

However, adaptive online models typically make for an enhanced user experience. TikTok is a good, albeit extreme, example of the trend towards real-time machine learning. Their model is excellent at capturing and keeping user attention through its recommendations.

Even if you don’t plan on engineering the next TikTok, there are still benefits to performing inference in real-time. I highly recommend reading this article by Chip Huyen (also linked above) on the trend towards real-time machine learning.

Of course, online inference presents an entirely new set of engineering challenges.

First, deep learning models benefit from scale. That is to say that most frameworks are designed to deal with large numbers of examples at once. In a production setting, you’re dealing with a batch size of 1, which means you can’t benefit from parallelization across many examples.

Additionally, serving predictions in an online manner wraps all of the same issues you’d deal with in a traditional application around a computationally expensive model. You need to be concerned with fault tolerance, network latency, etc.

You also need to consider model payload serialization–the format in which you send information to your model can impact both latency and performance. For example, it’s common to send requests with JSON; however, this might not be ideal for sending requests to models. JSON doesn’t support flexible numeric types, so you can silently lose precision sending requests via JSON, leading to degraded performance.
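As a small (contrived) illustration, the sketch below uses Elixir's bitstring syntax to round-trip 0.1 through a 32-bit float, mirroring what happens when an {:f, 32} tensor value is widened to the 64-bit floats that JSON encoders serialize:

```elixir
# Store 0.1 as a 32-bit float (the way an {:f, 32} tensor holds it),
# then read it back as a regular 64-bit Elixir float, which is what a
# JSON library such as Jason will serialize.
<<f32_value::float-32>> = <<0.1::float-32>>

IO.inspect(f32_value)        # 0.10000000149011612
IO.inspect(f32_value == 0.1) # false, the rounding error is baked into the payload
```

The value shifts before it ever reaches the wire, which is exactly the kind of silent degradation a binary payload format avoids.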

Fortunately, most of the considerations you should have when serving a model for online inference have been built into industry standard model serving solutions. Before we discuss these solutions, I’ll briefly discuss how and why you might settle on a pure Elixir solution.

As a final exercise in determining what kind of deployment makes sense, consider the fraud detection model. Should you perform predictions in an online or offline manner? This decision largely stems from business requirements, and I don’t think you could go wrong in either case.

In a perfect world, you’d be able to immediately decline suspicious transactions before they go through; however, that might not be feasible. Instead, it might be more suitable to validate transactions in bulk very frequently.

Model Serving Solutions

Model serving solutions are software designed specifically to solve the challenges associated with the deployment of deep learning models at scale. Serving online models is difficult. Model serving solutions are designed to make it easier. Let’s quickly touch on the features you get out of the box with most open source model servers.

Flexibility

While there are framework-specific model servers we will discuss later, most model servers will support multiple frameworks out of the box and also give you the option to add runtimes to serve custom model formats.

As an example, NVIDIA Triton Inference Server supports TensorRT, ONNX, TensorFlow, Torch, and more. There are also abstractions which allow you to build custom runtimes into the existing server infrastructure.

Autobatching

As I mentioned before, online model serving typically happens with a batch size of 1. If you get overlapping requests, you'll have to wait until your model processes each request before moving on to the next one. This is slow! There's generally minimal latency impact when adding additional examples to a batch, because model runtimes process batches in parallel.

Processing a single example at a time is inefficient. Autobatching solves this problem by accepting a slight latency increase while waiting for requests, then processing multiple requests at once.

For example, let's say you want to autobatch requests every 10 milliseconds with a maximum batch size of 64. When your model server receives a request, it will wait 10 milliseconds from the first request, or until the queue fills with 64 entries, before sending the batched requests to the model. For models which receive lots of concurrent requests, autobatching can have a massive impact on performance.
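The queueing policy itself is simple enough to sketch in plain Elixir. The following is a toy illustration of the idea (not Triton's actual implementation), collecting requests from the current process's mailbox until the batch is full or the window expires:

```elixir
defmodule Autobatch do
  # Wait for the first request, then keep collecting until either
  # `max_batch` requests have arrived or `window_ms` milliseconds
  # have passed since the first one, whichever comes first.
  def collect(max_batch, window_ms) do
    receive do
      {:infer, input} ->
        deadline = System.monotonic_time(:millisecond) + window_ms
        collect([input], max_batch, deadline)
    end
  end

  defp collect(batch, max_batch, _deadline) when length(batch) == max_batch do
    Enum.reverse(batch)
  end

  defp collect(batch, max_batch, deadline) do
    remaining = max(deadline - System.monotonic_time(:millisecond), 0)

    receive do
      {:infer, input} -> collect([input | batch], max_batch, deadline)
    after
      remaining -> Enum.reverse(batch)
    end
  end
end

# Three requests arrive in quick succession; the batcher waits at most
# 10ms past the first one, then hands all three over at once.
for input <- [1, 2, 3], do: send(self(), {:infer, input})
Autobatch.collect(64, 10)
# => [1, 2, 3]
```

In a real server this loop would live in its own process, forward each batch to the model, and reply to each caller with its slice of the predictions.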

Versioning

A good model server should allow you to serve multiple versions of the same model. As you deploy a new version of a model to production, you might want to perform some A/B testing on your new model versus your old model before fully rolling everyone over to the new version. To do this, you’ll need to be able to serve multiple versions of your model from different endpoints.

Concurrent execution

The need to serve multiple versions of a model at the same time also implies that a good model server should be capable of serving models concurrently. With limited compute this can be a challenge, as both models might not be capable of executing simultaneously on the same server. In these cases, your model server needs to deal with load balancing.

Many more not listed

There are many other challenges not listed here. There isn’t a model serving solution which solves all of the challenges you might run into perfectly. Additionally, this is still very much an evolving field. It’s only been 10 years (!!) since deep learning had its coming out moment, and even more recently that Google and other large companies started successfully putting deep learning into production.

Serving Axon

With the challenges of model serving in mind, you need to pick a serving solution and get your Axon model into a format capable of running within that solution. For this example, you’ll use NVIDIA Triton Inference Server.

NVIDIA Triton Inference Server is an industry standard model serving solution, and comes out of the box with all of the features you need to successfully put your models into production. You can get up and running relatively quickly with Docker in a few steps.

Step 1: Convert your model to ONNX

Triton supports a ton of different execution formats. You can serve TensorFlow SavedModels, PyTorch TorchScript models, TensorRT Models, ONNX models, etc.

There are roundabout ways to convert Axon to TensorFlow SavedModels and PyTorch Torchscript models; however, there is much more support for ONNX conversion at this time.

If you’ve saved your model as per the directions outlined in a previous section of this post, you can load and serialize the model to ONNX with the following code:

Mix.install([
  {:axon_onnx, "~> 0.1.0-dev", github: "elixir-nx/axon_onnx"},
  {:axon, "~> 0.1.0-dev", github: "elixir-nx/axon", override: true}
])
{model, params} = File.read!("model.axon") |> Axon.deserialize()
AxonOnnx.Serialize.__export__(model, params, path: "model.onnx")


After running, you’ll have an ONNX model which can be served from Triton. Note that the AxonOnnx API is currently experimental and subject to breaking changes. However, the idea that you can simply export a model to ONNX from an Axon model and parameters will remain the same.

Step 2: Make a model repository

Triton requires you to specify a model repository with a structure that looks something like:

- <model_repository_name>
  - <model_endpoint_name>
    - 1
    - 2
  - <model_endpoint_name>
    - 1

Where 1 and 2 are versions 1 and 2 of your model respectively. Triton has a lot of neat things you can do with your model repository.

For example, you can load models from S3 or GCP, dynamically load models, and more. Your directory structure for this example should look like:

- models
  - fraud
    - 1
      - model.onnx
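Because we'll start the server with --strict-model-config=false, Triton can infer most of the configuration directly from the ONNX file. If you want explicit control (for example, to turn on the autobatching described earlier), you can also place a config.pbtxt alongside the version directories. The following is a sketch; the input and output names here are assumptions that match the request and response used elsewhere in this post:

```text
# models/fraud/config.pbtxt (optional)
name: "fraud"
platform: "onnxruntime_onnx"
max_batch_size: 64
dynamic_batching {
  max_queue_delay_microseconds: 10000
}
input [
  {
    name: "input_0"
    data_type: TYPE_FP32
    dims: [ 30 ]
  }
]
output [
  {
    name: "dense_4"
    data_type: TYPE_FP32
    dims: [ 1 ]
  }
]
```

Note that when max_batch_size is set, the dims omit the leading batch dimension.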

Step 3: Pull Triton

It’s easiest to start Triton from a Docker container. You can pull the pre-built container using:

docker pull nvcr.io/nvidia/tritonserver:21.12-py3

Step 4: Start Triton

With Triton pulled, you can now start the server:

docker run --rm -p8000:8000 -p8001:8001 -p8002:8002 -v/home/sean/blog/deployment_example/models:/models nvcr.io/nvidia/tritonserver:21.12-py3 tritonserver --model-repository=/models --strict-model-config=false

You should replace the mounting of -v/home/sean/blog/deployment_example/models with the absolute path to where you created your model repository. This binds ports 8000, 8001, and 8002 to the Triton server. After a short wait, you should see:

+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
| fraud | 1       | READY  |
+-------+---------+--------+

You’re ready to make predictions!

Step 5: Make a prediction

Triton serves both an HTTP and GRPC endpoint which you can query to make predictions. To get a quick prediction from the HTTP endpoint, you can use the following script:

Mix.install([
  {:nx, "~> 0.1.0"},
  {:req, "~> 0.2.0"},
  {:jason, "~> 1.2"}
])
data = Nx.random_uniform({1, 30}) |> Nx.to_flat_list()
req_data = %{
  "inputs": [
    %{
      "name": "input_0",
      "datatype": "FP32",
      "shape": [1, 30],
      "data": data
    }
  ]
}
Req.post!("http://localhost:8000/v2/models/fraud/infer", Jason.encode!(req_data)) |> IO.inspect

Remember that the model you serialized has an input shape of {nil, 30}, so we generate a request which sends a single example with random inputs. FP32 indicates that the model expects {:f, 32} input types.

Triton supports requests in a plain text format (like you see here), and a binary format. In practice you should probably use the binary format, but for the sake of simplicity the plain text format is shown here.

If you run this script in the terminal, you'll see a response that looks something like:

%Req.Response{
  body: %{
    "model_name" => "fraud",
    "model_version" => "1",
    "outputs" => [
      %{
        "data" => [0.22510722279548645],
        "datatype" => "FP32",
        "name" => "dense_4",
        "shape" => [1, 1]
      }
    ]
  },
  ...
}

Notice how you can parse the prediction into a Tensor for further processing with Nx! You’ve successfully deployed a model using Elixir, ONNX, and NVIDIA Triton Inference Server!
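As a sketch of that last step (reusing the Nx dependency from the earlier script), you can pattern match on the decoded response body and rebuild a tensor from the flattened data and shape:

```elixir
Mix.install([{:nx, "~> 0.1.0"}])

defmodule TritonResponse do
  # Rebuild an Nx tensor from a decoded Triton inference response.
  def to_tensor(%{"outputs" => [%{"data" => data, "shape" => shape}]}) do
    data
    |> Nx.tensor(type: {:f, 32})
    |> Nx.reshape(List.to_tuple(shape))
  end
end

# Using the response body shown above:
body = %{
  "model_name" => "fraud",
  "model_version" => "1",
  "outputs" => [
    %{
      "data" => [0.22510722279548645],
      "datatype" => "FP32",
      "name" => "dense_4",
      "shape" => [1, 1]
    }
  ]
}

TritonResponse.to_tensor(body)
# => a {1, 1} tensor of type {:f, 32}
```

From here the prediction is an ordinary tensor, so you can threshold it, log it, or feed it into further Nx computation.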

Conclusion

This post was meant to serve as an introduction to model deployment and model serving using Elixir and Axon. It’s impossible to cover every detail of model deployment in a single post, but I hope this post points you in the right direction, and gets you thinking about how you can start serving your trained Axon models for others to use.

In my next post, we’ll cover model evaluation and monitoring! Until then!

What is Machine Learning Anyway?

Small blue robot in front of laptop. Credit: @Irrmago via Twenty20

Machine Learning Advisor

Sean Moriarity

Introduction

If you’ve read the book Made to Stick by Chip and Dan Heath, you might recall the excerpt on “listeners” and “tappers.”

Very early on in the book, they describe an experiment conducted in the 90s by Stanford PhD Elizabeth Newton. Newton’s experiment divides individuals into “listeners” and “tappers.” Tappers are tasked with tapping assigned songs such as “Happy Birthday” and “The Star Spangled Banner.” Listeners are tasked with identifying the songs their counterparts are tapping.

Before the experiment started, tappers were asked to assess the probability with which the listener would correctly guess the song they were assigned to tap. The average tapper believed their listener counterpart would correctly guess the song about 50% of the time. In reality, listeners only correctly guessed the song three out of 120 times.

The purpose of the excerpt in the book is to demonstrate the Curse of Knowledge: When communicating topics from a position of knowledge or authority, it’s often difficult to fully understand what it’s like to lack that particular knowledge or authority.

In other words, you often have a hard time remembering what it was like before you knew what you know now, and it makes it difficult to convey difficult topics to beginners. I find this is especially true for machine learning–so this post is an attempt to convey the foundations and basics of machine learning from the perspective of a listener.

WARNING: I haven’t finished the book yet, so I might fail miserably :)

Machines that Learn?

The first question you might ask when starting your journey to Machine Learning enlightenment is What is machine learning? I think it’s easiest to understand machine learning with some historical context, and with a greater understanding of artificial intelligence as a whole.

You probably have some intuition about the definition of artificial intelligence. A formal definition of artificial intelligence is a system or systems which mimics human intelligence on a particular task.

Of course, this leaves a ton of room for interpretation around the definition of intelligence. Most academics can probably agree on some of the elements that must be present in order for a system to be considered intelligent; however, there are probably equally as many things they would vehemently disagree about.

For a general understanding, I think it’s fair to argue that you can consider any system which attempts to mimic the senses, behaviors, or actions of humans to be artificial intelligence. The common aspects of human cognitive ability you’ll see mimicked in artificial intelligence are things like vision, speech, writing, planning, movement, learning, etc.

Artificial Intelligence as an academic field of research sprung out of a 1956 Dartmouth Workshop proposed by John McCarthy, Marvin Minsky, Claude Shannon, and Nathaniel Rochester. The workshop coined the term “artificial intelligence.” Since then, the field of artificial intelligence has experienced a number of booms and busts–years of incredible progress and growth followed by long stretches of stagnation or AI winter.

The early consensus in artificial intelligence research was that logic-based approaches were the most promising path to creating systems that mimicked human intelligence.

At face value, this feels correct. The idea that the world is made up of concrete facts, patterns, and rules feels correct, and that we should be able to express these facts, patterns, and rules succinctly with logic is simple and somewhat beautiful.

Initially, logic-based approaches showed great promise and spawned a generation of research and tooling for logic-based artificial intelligence. You can explore many of these classic approaches on GitHub in Peter Norvig’s Paradigms in Artificial Intelligence Programming book.

Unfortunately, logic-based approaches break down on tasks that are seemingly simple, but layered with complexity.

For example, what set of rules would you use to describe images of birds? What set of rules would you use to describe the written English language? What set of rules would you use to describe the inflections and sound that make up English speech?

You might be able to come up with a few rules for each of these questions, but you’ll probably quickly realize that your rules don’t cover certain edge cases, or there are specific exceptions to your rules. The problem with attempting to model systems with concrete rules is that the complexity of the ruleset increases exponentially with the complexity of the problem. Enumerating all of the rules which govern the complexity of real-world problems is incredibly challenging. What if there was a way to learn these rules through experience?

Machine learning was born from the idea that machines could be programmed to learn from experience on given tasks. Perhaps they’d be given some initial knowledge about the world, but largely they would learn from observations. Machine learning is thus considered a subset of artificial intelligence. The formal definition of machine learning, as defined by Tom Mitchell in his 1997 book Machine Learning is:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Those are the three defining characteristics of a machine learning algorithm: experience, tasks, and performance. The beauty of this definition is that it does not make any assumptions about the task, metrics, or training data. You can learn anything.
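To make those three ingredients concrete, here's a toy sketch (entirely made up for illustration) in which the task T is predicting the probability that an image contains a bird, the experience E is a list of labeled examples, and the performance measure P is the distance between the estimate and the true rate:

```elixir
defmodule BirdRate do
  # Task T: predict the probability that an image contains a bird.
  # Experience E: labeled examples, 1 for bird and 0 for no bird.
  # The "model" is simply the rate of birds observed so far.
  def fit(labels), do: Enum.sum(labels) / length(labels)

  # Performance P: how far the estimate is from the true rate.
  def error(estimate, true_rate), do: abs(estimate - true_rate)
end

# Suppose the true rate of bird images is 0.7. With more experience
# (more labeled examples), the estimate tends toward the true rate,
# so performance improves.
few_examples  = BirdRate.fit([1, 0, 0])
many_examples = BirdRate.fit([1, 1, 0, 1, 1, 0, 1, 1, 0, 1])

BirdRate.error(few_examples, 0.7) > BirdRate.error(many_examples, 0.7)
# => true
```

This "model" learns nothing interesting, but it satisfies Mitchell's definition: its performance at the task, as measured by P, improves with experience E.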

How do machines learn?

Armed with a high-level understanding of what machine learning is, you’re probably more confused or at least more curious to understand how machines learn. Before diving into that, let’s revisit one of the fatal flaws of rule-based approaches.

Remember that the number of rules required to capture a complex process increases exponentially with the complexity of the system. Why is that?

Rule-based approaches are rigid by default–they are comprised of facts, and query existing collections of facts with observations. There is no room for uncertainty, and thus no room for handling the countless edge cases that inevitably arrive in the chaos of the real-world. If, instead, we assumed uncertainty as a principle, we could capture the complexity of the world with fewer, simpler rules.

The classic example of uncertainty providing a better framework for modeling the real world comes from Ian Goodfellow's Deep Learning book.

Goodfellow and his co-authors ask the question, What birds fly? A rule based on uncertainty for this question might look something like: Most birds fly. You can even quantify this uncertainty with a probability if you have enough information: 80% of birds fly. Alternatively, a rule without uncertainty would look something like: All birds fly, except young birds, birds that are injured, penguins, ….. with a countless number of exceptions only an ornithologist would know.

The strength of modern machine learning approaches is that they are built on uncertainty. They make use of probability theory–which is the study of rules that quantify uncertainty–to model complex real-world processes.

Okay, but how do they do that? To answer this, we’ll consider machine learning in a supervised setting, because that’s the easiest to understand. The fundamental question in supervised learning is: Given X, how certain are you of Y? You can apply this to any supervised learning problem you might be familiar with:

  • Given this image, how certain are you that it contains a bird?
  • Given this text, how certain are you that the sentiment is positive?
  • Given these stats, how certain are you the Sixers will win the NBA finals? (I’m not very certain.)

In reality, a machine learning model is just a function:

def model(observation) do
  likelihood_of_event(observation)
end


Which returns the likelihood of an event given the observation. These likelihoods are expressed as probabilities, which is why you’ll often see scores associated with predictions. For example, if you train an Axon model to predict whether or not an image contains a bird, you’ll end up with predictions that look like:

#Nx.Tensor<
  f32
  0.75
>


This prediction can more or less be interpreted as a model saying: “I am 75% certain this image contains a bird.” In reality, a machine learning model is just a function that takes an observation and returns a likelihood.

But how does a model take a picture of a bird and spit out a probability? The actual process depends on the model–the description you would receive for something like decision trees differs from that of neural networks. I will try to explain the process in more general terms.

First, imagine for a second you were asked to physically sort a number of images on a scale from 0 to 1. Images placed near zero almost certainly don't contain a bird. Images placed near one almost certainly do contain birds.

This might be a difficult task to imagine because to you images concretely do or do not contain a bird. Rather than think of the problem as a discrete 0 or 1, try to consider the features of a bird, and then think about what kind of images might lie to the left, middle, and right of the scale.

For example, you might place images of inanimate objects very close to 0. Images of other animals such as cats or dogs might be a little further from 0, but still not too far. As you drift towards the center you might find airplanes and other objects which have defining features in common with birds. As you approach 1, the images you place have more of the defining features of a bird.

At the end of the exercise, you have essentially created a probability distribution of images that contain birds. Mentally, you transformed the input features you saw in the images into a probability. This is the same thing we do in machine learning.

Rather than manually transforming inputs into probabilities, you want to create a function that does it for you. You assume a parameterized probability distribution:

def model(params, inputs) do
  likelihood = f(params, inputs)
  likelihood
end

In the above code, f/2 is a function which answers the question: given inputs, what is the probability of Y? This prediction is based on its parameterization, params.

The presence of params means f/2 can try to replicate many other distributions from a family of probability distributions. When I say probability distribution, I’m really just talking about something that resembles the physical scale you created for bird images earlier. So f/2 is a function which represents a probability distribution and is how we actually obtain probabilities from inputs. But, that still doesn’t fully answer the question of how inputs become probabilities.

Remember from the exercise that you were asked to list the defining features of birds and use those features to predict probabilities. We can ask f/2 to do essentially the same thing. Our input images will come in as a vector or matrix of pixel color values. For each image, we’ll end up with just a bunch of numbers. We can transform these numbers in a number of ways using params assuming that params will “emphasize” certain groups of pixel values which contain defining bird features. For example, f/2 might look something like:

def f(params, input) do
  params
  |> Nx.dot(input)
  |> Nx.logistic()
end

This is a linear transformation of the input using params. Visually, you can imagine that all of our original images could be plotted somewhere in space. The linear transformation projects them into a different, perhaps more ordered orientation in space, and the logistic function squeezes this space into a scale from 0 to 1. This is where Nx and some understanding of linear algebra come in handy–you can transform inputs in any number of ways in an attempt to map them to probabilities.
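To see that squeezing in action: assuming Nx is installed, Nx.logistic/1 maps any real number into the open interval (0, 1):

```elixir
x = Nx.tensor([-10.0, -1.0, 0.0, 1.0, 10.0])

# Values far below zero land near 0, zero lands at exactly 0.5,
# and values far above zero land near 1.
Nx.logistic(x)
```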

Now, let’s assume that there exists some real function true_f and some true_params which represents the true probability distribution of all images that contain birds. And let’s assume that true_f looks identical to f:

def true_f(true_params, input) do
  true_params
  |> Nx.dot(input)
  |> Nx.logistic()
end

If you’re just given a sample of images and their corresponding result from true_f, how can you make f match true_f? In other words, how can you learn to model true_f with f? Well, you can try random guessing, or brute-force search; however, you’ll quickly find this to be a nearly impossible task. If you instead use some principles from optimization and probability theory, you can recover true_params and true_f more intelligently.

Your goal is pretty concrete here, you want f to match true_f which means params have to match true_params. You want to minimize the difference between f and true_f.

But, how do you measure the difference between functions without access to true_params? Well, you can attempt to measure the difference by measuring the difference between each observation in the dataset given to you. In other words, you use your model to predict a probability for each image, measure the difference between the predicted probability and the true label, and then update your model accordingly. You end up with an objective that looks something like:

defn objective(params, input, label) do
  probability = model(params, input)
  measure_difference(label, probability)
end

The form of measure_difference depends on the type of problem you’re trying to solve. This difference function is often called a loss function. You now have a clear objective function that contains your model params–by minimizing the objective function you are in essence minimizing the difference between f and true_f. You are learning to represent true_f using f from observations.
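The worked example later in this post uses mean squared error, but when the model outputs true probabilities, a more common choice of measure_difference is binary cross-entropy. A minimal Nx sketch:

```elixir
defmodule Loss do
  import Nx.Defn

  # Binary cross-entropy: confident wrong predictions are penalized heavily.
  defn measure_difference(label, probability) do
    -Nx.mean(
      label * Nx.log(probability) + (1 - label) * Nx.log(1 - probability)
    )
  end
end
```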

Okay, so how do you actually minimize objective/3? The most common form of optimization in machine learning is stochastic gradient descent.

To explain gradient descent, I will use the same analogy I used in a previous blog post. Imagine you’re dropped somewhere randomly in the ocean and you’re asked to find the deepest point in the water using only your depth-finder as a guide. You could move about randomly, but you’d probably end up wandering pretty aimlessly for days. A smarter approach would be: Take depth measurements in small increments in every direction. Determine which direction is the direction of deepest descent. Move in that direction. Repeat.

This is the essence of gradient descent–your objective function is an ocean, and you want to find the deepest point by constantly descending in the direction of steepest descent.

But, how can you find the direction of steepest descent? Mathematically, this is the gradient of a function. In Nx, this is what grad takes care of for you. You can use grad of your objective function with respect to params in order to update params slightly in a direction that minimizes the objective function on an example. Each step looks something like:

defn step(params, input, label) do
  direction_of_descent = grad(params, &objective(&1, input, label))
  params - direction_of_descent * 1.0e-2
end

Here, grad with respect to params tells you how to update params to minimize objective. The scaling factor 1.0e-2 is your learning rate; it ensures you don’t take too large of a step in any given direction. If you repeat this process iteratively over a given dataset, you end up closely replicating true_f with f.

What does this actually look like in Nx?

Now that you know all of these concepts somewhat in an abstract sense, you should be able to better recognize what’s going on in a concrete sense. Open up a new Livebook and install Nx:

Mix.install([
  {:nx, "~> 0.2.0"}
])

:ok

Now let’s generate some training data. Rather than dealing with probabilities, let’s just assume true_f spits out a dot-product between params and x:

true_params = Nx.random_uniform({16})
true_f = fn params, x ->
  params
  |> Nx.dot(x)
end
train_data =
  for _ <- 1..10000 do
    x = Nx.random_uniform({16})
    {x, true_f.(true_params, x)}
  end

[
  {#Nx.Tensor<
     f32[16]
     [0.07947574555873871, 0.026878003031015396, 0.5867477655410767, 0.5776385068893433, 0.9754040241241455, 0.5079066753387451, 0.3611658215522766, 0.7247434854507446, 0.6224258542060852, 0.0817679837346077, 0.18870306015014648, 0.9963228702545166, 0.6838437914848328, 0.7353075742721558, 0.4642966091632843, 0.6851630210876465]
   >,
   #Nx.Tensor<
     f32
     4.527798652648926
   >},
  {#Nx.Tensor<
     f32[16]
     [0.3381848633289337, 0.3867352604866028, 0.5400522947311401, 0.03547948971390724, 0.7606191635131836, 0.1566101759672165, 0.291944682598114, 0.42579999566078186, 0.6438153982162476, 0.4992257356643677, 0.30716437101364136, 0.9808345437049866, 0.4933328628540039, 0.4456803798675537, 0.6096257567405701, 0.6286845207214355]
   >,
   #Nx.Tensor<
     f32
     4.5737714767456055
   >},
  {#Nx.Tensor<
     f32[16]
     [0.06392016261816025, 0.5789399743080139, 0.27656567096710205, 0.6276429295539856, 0.5487242341041565, 0.3903144896030426, 0.051697079092264175, 0.873468816280365, 0.9662443995475769, 0.4221976697444916, 0.5376619100570679, 0.38977575302124023, 0.03834615647792816, 0.09812478721141815, 0.31701961159706116, 0.5563293695449829]
   >,
   #Nx.Tensor<
     f32
     3.5150163173675537
   >},
  ...
]

Now create a module which defines your model and objective functions. The form of measure_difference here is just the mean squared error between the prediction and the label.

defmodule Learn do
  import Nx.Defn
  defn f(params, input) do
    params
    |> Nx.dot(input)
  end
  defn model(params, input) do
    prediction = f(params, input)
    prediction
  end
  defn measure_difference(prediction, label) do
    prediction
    |> Nx.subtract(label)
    |> Nx.power(2)
    |> Nx.mean()
  end
  defn objective(params, input, label) do
    prediction = model(params, input)
    measure_difference(prediction, label)
  end
  defn step(params, input, label) do
    direction_of_descent = grad(params, &objective(&1, input, label))
    params - direction_of_descent * 1.0e-3
  end
end

{:module, Learn, <<70, 79, 82, 49, 0, 0, 19, ...>>, {:step, 3}}

Now to actually run the gradient descent, we need an initial set of parameters:

params = Nx.random_normal({16})

#Nx.Tensor<
  f32[16]
  [-0.7124313116073608, -0.08391565829515457, 0.39997708797454834, 0.37166494131088257, -0.8821085095405579, 1.1669548749923706, -0.7852609157562256, -1.3352124691009521, 0.8125752806663513, 0.5284674763679504, 0.4762270152568817, -1.5248165130615234, -0.5238316059112549, -1.1385467052459717, 2.1005051136016846, -1.7426177263259888]
>

We can reduce over the dataset, applying step at each observation, to slowly learn true_params:

recovered_params =
  for _ <- 1..10, reduce: params do
    params ->
      for {input, label} <- train_data, reduce: params do
        params ->
          Learn.step(params, input, label)
      end
  end

#Nx.Tensor<
  f32[16]
  [0.45476144552230835, 0.7819427251815796, 0.20503632724285126, 0.20784883201122284, 0.8274795413017273, 0.3705921173095703, 0.6816828846931458, 0.11929456144571304, 0.5488267540931702, 0.9811261892318726, 0.6553477048873901, 0.5045632719993591, 0.6943572163581848, 0.9759575128555298, 0.7533358335494995, 0.455101877450943]
>

Okay, so how well did we do? Let’s measure the error between model and true_f on a number of inputs:

differences =
  for _ <- 1..1000 do
    x = Nx.random_uniform({16})
    pred = Learn.model(recovered_params, x)
    actual = true_f.(true_params, x)
    Learn.measure_difference(actual, pred)
  end
  |> Nx.tensor()
  |> Nx.mean()

#Nx.Tensor<
  f32
  2.6374322864564093e-11
>

The error is nearly 0! Let’s see how this compares to our initial parameters:

differences =
  for _ <- 1..1000 do
    x = Nx.random_uniform({16})
    pred = Learn.model(params, x)
    actual = true_f.(true_params, x)
    Learn.measure_difference(actual, pred)
  end
  |> Nx.tensor()
  |> Nx.mean()

#Nx.Tensor<
  f32
  38.93861389160156
>

It worked!

Conclusion

I hope this article offers you a fresh perspective on what it really means for a machine to learn. If you have any feedback or any topics you’d like to see in the future, please don’t hesitate to reach out.

The Nx team has some exciting things planned for the future. Come chat with us in the EEF ML WG Slack, and be sure to follow me and DockYard on Twitter to catch all of my future posts.

Elixir versus Python for Data Science

A copper scale against a gray background

Machine Learning Advisor

Sean Moriarity

Introduction

Over the last year, thanks to the efforts of the amazing Elixir community, the Elixir machine learning ecosystem has grown at an impressive rate, with more and more libraries filling more and more gaps in the Elixir data science and machine learning ecosystem.

A common argument against using Nx for a new machine learning project is a perceived lack of libraries or support for some common task that is readily available in Python. In this post, I’ll do my best to highlight areas where this is not the case, and compare and contrast Elixir projects with their Python equivalents. Additionally, I’ll discuss areas where the Elixir ecosystem still comes up short, and where using Nx for a new project might not be the best idea.

Numerical Computing

The obvious place to start this post is to compare Nx to its Python equivalents. At its core, Nx is intended to serve as a NumPy equivalent with support for automatic differentiation and acceleration via GPUs. In this respect, its main inspiration is JAX—a Python library that supports automatic differentiation and JIT compilation to accelerators via composable function transformations.

The Nx API is (intentionally) considerably smaller than the NumPy API. Because Nx relies on JIT compilation, the API builds around a smaller amount of powerful primitive operations which can be used to build out more complex functions.
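For example, a numerically stable softmax, which NumPy users often reach for in SciPy, composes from just a few Nx primitives:

```elixir
defmodule MyOps do
  import Nx.Defn

  # Subtracting the max first avoids overflow in Nx.exp/1
  # without changing the result.
  defn softmax(t) do
    exp = Nx.exp(t - Nx.reduce_max(t))
    exp / Nx.sum(exp)
  end
end
```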

NumPy does not have the same luxury, instead needing to rely on specialized implementations for most of the functions in its API. JAX also builds on a set of core primitive operations; however, it intentionally provides wrappers around the NumPy API due to NumPy’s ubiquity in the numerical computing community.

There are pros and cons to having a smaller API. From a learning perspective, a beginner can reasonably pick up and understand 90% of the functions in the Nx API rather quickly.

Unfortunately, this trade-off means Nx implementations can at times be more verbose than their NumPy counterparts. Due to the sheer size of the NumPy API, there are often times when a few lines of NumPy translate to considerably more lines in Nx. Additionally, the Nx API falls short of the NumPy API in some areas. For example, Nx PRNG support is not as feature-complete as JAX/NumPy, the Nx linear algebra module is not as in-depth as NumPy’s, and Nx does not have support for string data types.

The API shortcomings of Nx are mostly active areas of work. Even with these shortcomings, I’ve found I can be just as productive writing Nx as I can writing NumPy.

From a performance perspective, if you’re using the EXLA compiler, Nx will have (mostly) equivalent performance to JAX, because Nx/EXLA relies on the same JIT compiler as JAX: XLA. That means that in essentially all of the areas where JAX beats NumPy, Nx will also beat NumPy; and in all of the areas where NumPy beats JAX, NumPy will also beat Nx.

One advantage that Nx has over JAX is its first-class support for pluggable compilers and backends. While the JAX project seems to be moving in the direction of supporting multiple pluggable compilers/runtimes, Nx was built with this flexibility in mind, and thus is positioned for rapid integration with any existing/future tensor compilers and backends.

JAX is ahead in terms of parallelization; however, there are plans to integrate parallel primitives into the Nx API on the roadmap. Given that Nx can build on the same parallelism/sharding implementations as JAX in XLA, Nx can catch up to JAX relatively quickly in this respect.

Deep Learning

One of the initial ambitions for the Nx project was to support creating and training deep-learning-type models in Elixir. This is now possible with the Axon library.

Axon is built directly on top of Nx and thus can take advantage of all of the things Nx offers, including JIT compilation and automatic differentiation. Axon most directly compares to tools like PyTorch and TensorFlow/Keras in the Python ecosystem.

From an API perspective, Axon is roughly even with both PyTorch and TensorFlow/Keras. Aside from attention/transformer layers (which are on the roadmap), Axon has an essentially identical offering of model building blocks. Additionally, with its custom layer API, creating and using new layers is as easy as defining an Nx implementation of the layer. More or less any model you can create and perform inference with in PyTorch/TensorFlow, you can also create and perform inference with in Axon.

Axon also has a robust training API inspired by libraries in the Python ecosystem such as PyTorch Ignite and PyTorch Lightning. The training API supports out-of-the-box callbacks such as model checkpoints, early stopping, and model validation, as well as an API for integrating custom callbacks. Similar to Keras, the Axon training API offers increasing levels of flexibility at increasing levels of complexity. In other words, you can sacrifice simplicity to have more control over training your models.
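As a rough sketch of what that looks like in practice (the exact Axon.input signature and Axon.Loop options have shifted between Axon releases, and train_data here is an assumed stream of {input, target} batches):

```elixir
model =
  Axon.input("features", shape: {nil, 784})
  |> Axon.dense(128, activation: :relu)
  |> Axon.dense(10, activation: :softmax)

# The high-level API: pick a loss and an optimizer, then run the loop.
trained_params =
  model
  |> Axon.Loop.trainer(:categorical_cross_entropy, :adam)
  |> Axon.Loop.run(train_data, %{}, epochs: 5)
```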

One area of concern when migrating to Elixir is the ability to make use of pre-trained models. Thanks to AxonOnnx this is possible for (almost) any model you might have.

If you’re able to export an ONNX version of your model (e.g. using torch.onnx or tf2onnx), you can probably import your model with Axon. AxonOnnx has even been tested to work with pre-trained transformers from the popular transformers library.
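Assuming you have a model.onnx file on disk, the import itself is a one-liner; this sketch uses AxonOnnx.import/1 and an arbitrary input shape as illustrative assumptions:

```elixir
{model, params} = AxonOnnx.import("model.onnx")

# The result is an ordinary Axon model plus its parameters,
# ready for inference with Axon.predict/3.
input = Nx.random_uniform({1, 3, 224, 224})
Axon.predict(model, params, input)
```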

While Axon supports importing pre-trained models, there are still some aspects of working with pre-trained models that need ironing out. For example, fine-tuning, while possible, does not yet have a first-class API in Axon. Additionally, features such as mixed-precision and multi-device training that make training large models possible are not 100% supported in Axon yet.

Traditional Machine Learning

Along with deep learning, gradient boosting and decision tree algorithms are perhaps the most popular machine learning algorithms in use. These classes of algorithms typically outperform deep learning on tabular and time-series data, and are often significantly less expensive to train and deploy.

Unfortunately, this is an area still under active development in the Elixir ecosystem. Python has popular libraries such as XGBoost, but there is still no Elixir equivalent. I expect this to change over the next six months; however, for the time being, Elixir is behind in this area.

Elixir also falls short of Python in other traditional machine learning applications. While Python has the excellent scikit-learn, Elixir has the relatively new Scholar library. Because Scholar is new, it’s lacking in features that allow it to be a competitive alternative to sklearn. This is another area of active development on the Nx roadmap, and thus I expect things to look significantly different here in the next six months.

Data Analysis

Essentially any data scientist that has worked with Python for any amount of time is familiar with the pandas library for data analysis. Pandas is a library for working with structured, columnar data. It’s popular as a library for any sort of analysis or munging tasks you might need to perform. The Elixir equivalent to Pandas is Explorer. Explorer is built on top of the polars library which implements DataFrames in Rust.

From an API perspective, the Explorer API is different from what you might be used to in Python. Given that Elixir is a functional language, the Explorer library builds on immutable abstractions, which can feel quite different for somebody migrating from Python and pandas’ mutability. Explorer, like Nx, is notably more succinct than its Python counterpart. Despite this, there is little you can do in pandas that you can’t also do in Explorer.
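As a taste of that functional style, here is a small sketch of working with an Explorer DataFrame (the data is made up, and some function names have shifted between Explorer releases):

```elixir
alias Explorer.DataFrame, as: DF
alias Explorer.Series

df = DF.new(animal: ["robin", "cat", "crow"], mass_g: [77, 4000, 450])

# Every operation returns a new, immutable frame; df itself is untouched.
light = DF.filter_with(df, fn row -> Series.less(row["mass_g"], 1000) end)
```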

From a performance perspective, Explorer benefits from the speed of Polars. There are a number of articles that laud Polars as the fastest DataFrame library. Given that Explorer builds on that performance, you might see significant performance improvements migrating from Pandas to Explorer.

Data Presentation/Visualization

In data science, presentations and visualizations are where the money is made. Having a good tool for presenting and visualizing data is a must for any language looking to position itself in the data science sphere. Python has a number of excellent visualization libraries such as Plotly Express and matplotlib. The Elixir equivalent is the VegaLite library, which provides bindings to the Vega-Lite graphics grammar.

Functionally, you can get essentially equivalent visualizations from both Elixir and Python. The VegaLite API might feel unfamiliar to users coming from Plotly Express and matplotlib; however, the abstractions are incredibly powerful and allow for composing ever more complex graphics with code.
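For example, a simple line chart composes like this (a sketch against the Elixir VegaLite bindings; treat the exact option names as approximate):

```elixir
alias VegaLite, as: Vl

# Data, mark, and encodings are layered on piece by piece.
Vl.new(width: 400, height: 300)
|> Vl.data_from_values(x: [1, 2, 3, 4], y: [1, 4, 9, 16])
|> Vl.mark(:line)
|> Vl.encode_field(:x, "x", type: :quantitative)
|> Vl.encode_field(:y, "y", type: :quantitative)
```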

For the most part, I’ve found it possible to perform equivalent visualizations in both Elixir and Python; however, Python seems to have an edge in network visualizations and geographic visualizations. Elixir has no equivalents to Python’s NetworkX and the Folium library. I suspect with companies like Felt using Elixir in the map-making space that we might see geographic visualizations in Elixir improve (fingers crossed).

The Python ecosystem also has a number of libraries concerned with Dashboard creation. Tools such as Dash allow for the creation of interactive demos with a few lines of code. There are no direct equivalents in Elixir just yet; however, the direction of Livebook is promising for the prospects of interactive and shareable demos in Elixir.

Pipelines / Orchestration

Whether it be training large models or creating production-ready data ingest/management solutions, data orchestration and pipelining are important tasks in data science and machine learning. There are a large number of Python libraries built specifically to create and orchestrate data pipelines. In Elixir, there are a few; however, this is an area where I would personally argue Elixir has a strong edge over Python. Given that Elixir is built on the BEAM, which is designed for concurrency, the task of Concurrent Data Processing in Elixir is a natural extension of the language. Python is just not designed with concurrency in mind. From simple language-level abstractions such as Task to library-level abstractions such as Flow and Broadway, creating scalable input processing pipelines is incredibly easy with Elixir by default.
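As a small illustration, the standard library alone gives you a bounded-concurrency preprocessing pipeline; preprocess here is a hypothetical stand-in for any per-example transformation:

```elixir
preprocess = fn example -> example * 2 end

# Run preprocess concurrently, capped at one task per scheduler.
results =
  1..1_000
  |> Task.async_stream(preprocess, max_concurrency: System.schedulers_online())
  |> Enum.map(fn {:ok, result} -> result end)
```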

That’s not to say that Python doesn’t have some nice libraries for achieving the same results. Both PyTorch and TensorFlow offer nice data loading abstractions in tf.data and DataLoader. Additionally, there are a number of libraries designed for building and orchestrating data pipelines at scale (e.g. Prefect and AirFlow). The biggest advantage Elixir has in this space is that it is concurrent and fault-tolerant by design. I don’t think Python can ever beat Elixir in this regard.

Domain-Specific Libraries

There are a number of “domain-specific” libraries that don’t neatly fall into any of the categories I’ve written so far, but which are worth a brief mention in this article—namely computer vision, natural language processing, and signal processing.

There are a number of computer vision-related libraries in the Python ecosystem that generally streamline the task of working with images. This includes Pillow and OpenCV among others. In the Elixir ecosystem, there is Evision, which provides bindings to OpenCV implementations. From both an API and performance perspective, this means working with images in Evision will be somewhat similar to working with images in Python’s OpenCV bindings.

For NLP tasks, Elixir does not have a library that is equivalent to Python’s spacy or NLTK. However, it does offer bindings to Huggingface’s tokenizers, and the ability to import a significant number of HuggingFace models for performing NLP tasks with neural networks.

The Elixir ecosystem still falls behind Python in the area of signal processing; however, with recent work to add FFT support to Nx, and other dedicated efforts, I expect this area to improve in the near future.

Conclusion

This post was meant to serve as a high-level comparison of the Elixir machine learning and data science ecosystem with the Python ecosystem. While there are still many gaps in the Elixir ecosystem, the progress over the last year has been rapid. Almost every library I’ve mentioned in this post is less than two years old, and I suspect there will be many more projects to fill some of the gaps I’ve mentioned in the coming months.

If you’re interested in helping in any of our areas of active development, join us at the EEF ML Working Group to drive machine learning on the BEAM forward. Until next time!

Why Should I use Axon?

The framework of a steel bridge

Machine Learning Advisor

Sean Moriarity

During my talk for ElixirConf 2022, my goal was to convey that Axon is a production-ready deep learning framework. In this post, I’ll re-hash some of those same points, and do my best to make the case for using Nx and Axon in a production machine learning project.

What is Axon?

Axon is a deep learning framework for the Elixir ecosystem that offers a simple and straightforward means to create and train neural networks. Axon is similar to frameworks like PyTorch and TensorFlow from the Python ecosystem. You can check out some of my posts from the DockYard blog to see examples of Axon in action.
Axon 是 Elixir 生态系统的深度学习框架,它提供了一种简单直接的方法来创建和训练神经网络。 Axon 类似于 Python 生态系统中的 PyTorch 和 TensorFlow 等框架。您可以在 DockYard 博客中查看我的一些帖子,以查看 Axon 的实际应用示例。

What is deep learning? 什么是深度学习?

Before considering how Axon can help you in a production project, you need to understand how deep learning can help you.
在考虑 Axon 如何在生产项目中为您提供帮助之前,您需要了解深度学习如何为您提供帮助。

Deep learning is a subset of machine learning based on artificial neural networks. Neural networks are somewhat pseudo-biologically inspired. Neural networks don’t really work like the human brain, at least in ways that we understand.
深度学习是基于人工神经网络的机器学习的一个子集。神经网络在某种程度上受到了伪生物学的启发。神经网络并不真正像人脑那样工作,至少在我们理解的方式上是这样。

In reality, neural networks make use of composed linear and non-linear transformations to learn hierarchical representations of input data. As it turns out, this relatively simple approach to modeling input data, combined with the unreasonable effectiveness of gradient descent, can yield incredible results on a variety of tasks.
实际上,神经网络利用组合的线性和非线性变换来学习输入数据的层次表示。事实证明,这种相对简单的输入数据建模方法,结合梯度下降的不合理有效性,可以在各种任务上产生令人难以置信的结果。
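To make "composed linear and non-linear transformations" concrete, here is a toy single layer in plain Elixir. ToyLayer is a made-up name, not Axon's API, and the weights and inputs are invented for illustration: it computes activation(Wx + b), the building block that real networks stack many times over and that frameworks like Axon train for you.

```elixir
defmodule ToyLayer do
  # One layer: a linear transformation (weights and bias) followed by a
  # non-linear activation. For each output neuron, take the dot product of
  # its weight row with the input, add the bias, then apply the activation.
  def forward(x, weights, bias, activation) do
    weights
    |> Enum.map(fn row ->
      row
      |> Enum.zip(x)
      |> Enum.reduce(bias, fn {w, xi}, acc -> acc + w * xi end)
    end)
    |> Enum.map(activation)
  end

  # ReLU, a common non-linearity: clamp negatives to zero.
  def relu(z), do: max(z, 0.0)
end

# Two inputs, two output neurons; returns roughly [0.1, 0.0].
ToyLayer.forward([1.0, -2.0], [[0.5, 0.25], [-1.0, 1.0]], 0.1, &ToyLayer.relu/1)
```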

Chances are that in the last few years you’ve seen and directly interacted with some incredible innovations that are byproducts of deep learning. GPT-3, DALL-E 2, and Stable Diffusion are incredibly powerful and incredibly popular deep learning models.
在过去的几年中,您有机会看到并直接接触到一些令人难以置信的创新,这些创新是深度学习的副产品。 GPT-3、DALL-E 2 和 Stable Diffusion 是非常强大且非常受欢迎的深度学习模型。

Deep learning has essentially broken every expectation of what is technically possible in the realm of artificial intelligence and machine learning. A byproduct of this is the emergence of startups and companies focused on building an AI-centric future.
深度学习基本上打破了人工智能和机器学习领域技术上可能实现的所有期望。其副产品是专注于构建以 AI 为中心的未来的初创公司和公司的出现。

Many experts are focused on the debate over whether or not the current trajectory of deep learning will lead to artificial general intelligence (AGI). I believe this debate is a red herring. I am cautiously optimistic that deep learning has reached a point where it’s capable of solving challenging open-ended problems across a broad range of industries. The value proposition of using deep learning to some extent in your applications is approaching a point that is too great to ignore.
许多专家都在争论深度学习的当前轨迹是否会导致通用人工智能 (AGI)。我认为这场辩论是转移注意力。我对深度学习已经达到能够解决广泛行业中具有挑战性的开放式问题的地步持谨慎乐观的态度。在您的应用程序中在某种程度上使用深度学习的价值主张正在接近一个不容忽视的点。

What are the benefits of Axon?
Axon 有什么好处?

Axon as a deep learning framework stacks up well against both PyTorch and TensorFlow. From a feature-set perspective, Axon reaches near feature parity* with both. There are some notable limitations–namely mixed-precision training and distributed training–however, both are on the roadmap and will more than likely be implemented and tested within the next few months.
Axon 作为一个深度学习框架,与 PyTorch 和 TensorFlow 紧密相连。从功能集的角度来看,Axon 与 PyTorch 和 TensorFlow 的功能接近*。有一些明显的限制 —— 即混合精度训练和分布式训练 —— 但是,这两者都在路线图上,很可能会在未来几个月内实施和测试。

If you’re transitioning from PyTorch or TensorFlow, Axon will feel very familiar. This is intentional. Axon’s API is designed to mirror the syntax and semantics of both frameworks–with some exceptions to account for Elixir’s functional style. The barrier to entry in learning Axon as a programmer familiar with Python’s deep learning frameworks is relatively low.
如果您正在从 PyTorch 或 TensorFlow 过渡,Axon 会让您感到非常熟悉。这是有意为之的。Axon 的 API 旨在镜像这两个框架的语法和语义 —— 除了为适应 Elixir 的函数式风格而做的一些调整。对于熟悉 Python 深度学习框架的程序员来说,学习 Axon 的入门门槛相对较低。

Given the similarities of Axon to PyTorch and TensorFlow, what differentiates Axon? One of the most compelling cases for Axon is that Axon is capable of seamlessly scaling up or down to meet your production needs.
鉴于 Axon 与 PyTorch 和 TensorFlow 的相似之处,Axon 有何不同之处? Axon 最引人注目的案例之一是 Axon 能够无缝地向上或向下扩展以满足您的生产需求。

As a byproduct of building on top of Nx’s flexible runtime options, you can use the same deployment interface in Axon when targeting mobile, edge, and server deployments–all you need to change is the Nx backend or compiler. Pair this with a language like Elixir, which also scales up or down to meet your production needs, and you can develop for mobile (Elixir Desktop), edge (Nerves), and server (Phoenix) deployments with the same stack. The Python ecosystem is much more fragmented in this regard. There are a number of deployment runtimes and tools that are designed to overcome the shortcomings of the language.
作为建立在 Nx 灵活的运行时选项之上的副产品,您可以在针对移动、边缘和服务器部署时使用 Axon 中的相同部署接口 —— 您需要更改的只是 Nx 后端或编译器。将其与像 Elixir 这样的语言配对,它也可以按比例放大或缩小以满足您的生产需求,您可以使用相同的堆栈为移动(Elixir 桌面)、边缘(Nerves)和服务器(Phoenix)部署进行开发。 Python 生态系统在这方面更加分散。有许多部署运行时和工具旨在克服该语言的缺点。

Using Axon also opens you up to the benefits of using Elixir across the entire machine learning operations lifecycle. Some of Elixir and Erlang/OTP’s best features for designing robust, scalable, and fault-tolerant applications are also unintentional strengths that can benefit you at every step in the lifecycle of a machine learning deployment.
使用 Axon 还可以让您了解在整个机器学习操作生命周期中使用 Elixir 的好处。 Elixir 和 Erlang/OTP 的一些用于设计健壮、可扩展和容错应用程序的最佳功能也是无意中的优势,可以在机器学习部署生命周期的每一步中使您受益。

One of the strongest arguments against using Nx and Axon is the lack of maturity in the ecosystem. This is a valid concern. After all, by choosing to use Nx and Axon, you are, in some ways, forgoing the ability to use familiar tools like Pandas, Spark, Airflow, Prefect, and more. However, depending on your use case, you will likely find an analogous tool available in the Elixir ecosystem. Or, you might find that using Elixir completely eliminates the need for an existing solution.
反对使用 Nx 和 Axon 的最有力论据之一是生态系统缺乏成熟度。这是一个合理的担忧。毕竟,通过选择使用 Nx 和 Axon,在某些方面,您将放弃使用 Pandas、Spark、Airflow、Prefect 等熟悉的工具的能力。但是,根据您的用例,您可能会在 Elixir 生态系统中找到可用的类似工具。或者,您可能会发现使用 Elixir 完全消除了对现有解决方案的需求。

Why should I use Axon if…?
如果……我为什么要使用 Axon?

Generally speaking, there are four permutations to the question “Why should I use Axon?”.
一般来说,“我为什么要使用 Axon?”这个问题有四种排列方式。

  1. Why should I use Axon if my application uses Elixir, and I have a machine learning need?
    如果我的应用程序使用 Elixir,并且我有机器学习需求,为什么还要使用 Axon?
  2. Why should I use Axon if my application uses Elixir, but I don’t have a machine learning need?
    如果我的应用程序使用 Elixir,但我没有机器学习需求,为什么还要使用 Axon?
  3. Why should I use Axon if my application doesn’t use Elixir, but I have a machine learning need?
    如果我的应用程序不使用 Elixir,但我有机器学习需求,为什么要使用 Axon?
  4. Why should I use Axon if my application doesn’t use Elixir, and I don’t have a machine learning need?
    如果我的应用程序不使用 Elixir,而且我没有机器学习需求,我为什么要使用 Axon?

I will do my best to answer each one.
我会尽力回答每一个。

1. My app uses Elixir and machine learning
1. 我的应用使用 Elixir 和机器学习

If you’re reading this article, it’s likely you are firmly in this camp. You’re already using Elixir in some or all of your development stack, and your application makes use of machine learning.
如果您正在阅读本文,那么您很可能坚定地站在这个阵营中。您已经在部分或全部开发堆栈中使用 Elixir,并且您的应用程序使用了机器学习。

The most compelling benefit of making the switch to using Axon and Nx is that you can avoid fragmented machine learning workflows.
改用 Axon 和 Nx 的最引人注目的好处是您可以避免分散的机器学习工作流程。

One of the biggest challenges of MLOps is gluing everything together in production. When your workflows are fragmented across languages and enterprise solutions, gluing feels more like duct taping. Oftentimes this leads teams to reach for enterprise-grade solutions for dataflow engineering such as Spark and Prefect. However, I believe if you make the switch to a full-Elixir stack you might be able to completely eliminate the need for these solutions in production.
MLOps 的最大挑战之一是在生产中将所有内容粘合在一起。当您的工作流程分散在不同语言和企业解决方案中时,粘合感觉更像是管道胶带。通常,这会导致团队寻求用于数据流工程的企业级解决方案,例如 Spark 和 Prefect。但是,我相信如果您切换到完整的 Elixir 堆栈,您可能能够完全消除生产中对这些解决方案的需求。

If you have an existing data science and machine learning team comfortable working in Python, it’s both costly and difficult to compel them to learn a new language. Fortunately, one of Axon’s development priorities is portability. Using a solution like AxonOnnx (and some future projects still to come), you can easily export most models from Python and import them directly to Elixir. We are constantly adding support for more models, so if you find your model is unsupported, please reach out or open an issue.
如果您现有的数据科学和机器学习团队习惯于使用 Python,那么强迫他们学习一门新语言既昂贵又困难。幸运的是,Axon 的开发重点之一是便携性。使用像 AxonOnnx 这样的解决方案(以及一些未来的项目),您可以轻松地从 Python 导出大多数模型并将它们直接导入 Elixir。我们不断增加对更多模型的支持,因此如果您发现您的模型不受支持,请联系我们或提出问题。

2. My app uses Elixir, but not machine learning
2. 我的应用使用了 Elixir,但没有使用机器学习

If you have an application that is already using Elixir but doesn’t have an immediate machine learning need, then it doesn’t necessarily make sense to go searching for one just so you can use Axon. I am a firm believer that you should not use machine learning unless it’s absolutely necessary.
如果您的应用程序已经在使用 Elixir 但没有直接的机器学习需求,那么为了能用上 Axon 而刻意去寻找一个机器学习需求并不一定有意义。我坚信除非绝对必要,否则你不应该使用机器学习。

On the flip side, I am also a firm believer that there are applications of machine learning everywhere. As I mentioned at the beginning of this article, deep learning is eating the world. While I don’t love the idea of using deep learning for deep learning’s sake, I do believe that the value proposition of integrating deep learning into existing applications is getting too high to ignore.
另一方面,我也坚信机器学习的应用无处不在。正如我在本文开头提到的,深度学习正在吞噬世界。虽然我不喜欢为了深度学习而使用深度学习的想法,但我确实相信将深度学习集成到现有应用程序中的价值主张已经变得太高了,不容忽视。

3. My app doesn’t use Elixir but does use machine learning
3. 我的应用没有使用 Elixir 但使用了机器学习

For those in this group, it’s difficult to make the case that you should upend your entire workflow and start using Elixir for everything. The cost of completely replacing your stack with Elixir is likely too high for you to feel comfortable making that decision.
对于这个群体中的人来说,很难证明你应该颠覆你的整个工作流程并开始使用 Elixir 来处理所有事情。用 Elixir 完全替换您的堆栈的成本可能太高,您无法放心做出该决定。

Rather than upending your entire stack, I would encourage you to investigate parts of your machine learning operations cycle that can benefit from using Elixir. Perhaps the lowest friction place to start is replacing some of your workflow automation and dataflow engineering with an Elixir solution.
与其颠覆你的整个堆栈,我鼓励你调查你的机器学习操作周期中可以从使用 Elixir 中获益的部分。也许摩擦最小的起点是用 Elixir 解决方案替换您的一些工作流自动化和数据流工程。

Of course, you might find that completely making the switch isn’t as difficult or as costly as you think. There are success stories in the wild (see Amplified) of companies switching from a fragmented stack to a 100% Elixir stack for their machine learning products, and experiencing the cost and time benefits of working in a full Elixir stack.
当然,您可能会发现完全转换并不像您想象的那么困难或昂贵。有许多公司的成功案例(请参阅 Amplified),他们的机器学习产品从碎片化堆栈切换到 100% Elixir 堆栈,并体验了在完整 Elixir 堆栈中工作的成本和时间优势。

4. My app doesn’t use Elixir or machine learning
4. 我的应用不使用 Elixir 或机器学习

Similar to group #3, it’s likely you’re not ready to commit to upending your stack in favor of Elixir.
与第 3 组类似,您可能还没有准备好承诺颠覆您的堆栈以支持 Elixir。

However, similar to group #2 there may be a compelling use case for machine learning in your future. If you find that it makes sense to start integrating some intelligent components into your application, the cost of choosing to start with Nx and Axon is low.
然而,与第 2 组类似,未来可能会有一个令人信服的机器学习用例。如果您发现开始将一些智能组件集成到您的应用程序中是有意义的,那么选择从 Nx 和 Axon 开始的成本很低。

You might find it easier to integrate Elixir into your existing workflows than you would some of the tools in the Python ecosystem. Again, you shouldn’t seek out a machine learning need if one doesn’t exist, but you also shouldn’t ignore the possibilities of what you can accomplish with the power of machine learning today.
与 Python 生态系统中的某些工具相比,您可能会发现将 Elixir 集成到现有工作流程中更容易。同样,如果机器学习需求不存在,您不应该去寻找,但您也不应该忽视当今利用机器学习的力量可以完成的事情的可能性。

Conclusion 结论

I hope this article leads you to think about integrating Nx and Axon into your application. One of my goals for the coming year is to prove the efficacy of the Nx ecosystem in production environments.
我希望本文能引导您考虑将 Nx 和 Axon 集成到您的应用程序中。我来年的目标之一是证明 Nx 生态系统在生产环境中的有效性。

If you have a machine learning use case and a desire to start with Nx or Axon, don’t hesitate to reach out. If you’re interested in learning more about Nx and Axon, check out my other blog posts.
如果您有机器学习用例并希望从 Nx 或 Axon 入手,请随时联系我们。如果您有兴趣了解更多关于 Nx 和 Axon 的信息,请查看我的其他博文。

Lastly, if you want to help shape the future of the Nx ecosystem, come join us in the EEF ML Working Group.
最后,如果您想帮助塑造 Nx 生态系统的未来,请加入我们的 EEF ML 工作组。

Until next time! 下次再见!

Semantic Search with Phoenix, Axon, and Elastic
使用 Phoenix、Axon 和 Elastic 进行语义搜索

A store shelf with bottles of wine and price tags

Machine Learning Advisor
机器学习顾问

Sean Moriarity  肖恩·莫里亚蒂

Introduction 介绍

Recent advancements in the field of natural language processing have produced models seemingly capable of capturing the semantic meaning of text.
自然语言处理领域的最新进展产生了似乎能够捕捉文本语义的模型。

Transformer-based language models are a type of neural network architecture capable of modeling complex relationships in natural language. Google introduced the original transformer architecture in 2017, and BERT, a transformer-based model Google released in 2018, has played a fundamental role in its search algorithm ever since it was adopted.
基于 Transformer 的语言模型是一种能够对自然语言中的复杂关系进行建模的神经网络架构。谷歌于 2017 年提出了最初的 Transformer 架构;BERT 是谷歌在 2018 年发布的基于 Transformer 的模型,自被采用以来一直在其搜索算法中发挥着基础性作用。

BERT is powerful enough to capture context and meaning. Its ability to effectively model text makes it a natural fit for semantic search. In a semantic search, the goal is to match documents and queries based on context and intent, rather than just matching on keywords. In other words, semantic search allows you to express queries in natural language, and retrieve documents that closely match the intent of your query.
BERT 足够强大,可以捕获上下文和意义。它有效地对文本建模的能力使其非常适合语义搜索。在语义搜索中,目标是根据上下文和意图匹配文档和查询,而不仅仅是匹配关键字。换句话说,语义搜索允许您用自然语言表达查询,并检索与您的查询意图最接近的文档。

Transformer models like BERT are a powerful tool and, fortunately, you can leverage this power directly from Elixir. In this post, I’ll walk you through a simple semantic search application that plays the role of sommelier–matching natural language requests to wines.
像 BERT 这样的 Transformer 模型是一个强大的工具,幸运的是,您可以直接从 Elixir 中利用这种强大的功能。在这篇文章中,我将向您介绍一个简单的语义搜索应用程序,它扮演侍酒师的角色 —— 将自然语言请求与葡萄酒相匹配。

Setting up the application
设置应用程序

Let’s start by creating a new Phoenix application:
让我们从创建一个新的 Phoenix 应用程序开始:

$ mix phx.new wine --no-ecto

Next, navigate here to download a newline delimited JSON (jsonlines) file of wines scraped from Wine.com. You’ll need these later on. Save the scraped wines to a directory of your choosing–I have mine in priv.
接下来,导航到此处以下载从 Wine.com 抓取的葡萄酒的换行符分隔 JSON (jsonlines) 文件。稍后您将需要这些数据。将抓取的葡萄酒数据保存到您选择的目录中 —— 我把我的放在 priv 中。

Now you’ll want to set up Elasticsearch. If you’re not familiar with Elasticsearch, it is a search and analytics engine that enables integration of search capabilities into existing applications. This application makes use of a local Elasticsearch setup; however, you can easily make use of a hosted instance from AWS or elsewhere.
现在您需要设置 Elasticsearch。如果您不熟悉 Elasticsearch,它是一个搜索和分析引擎,可以将搜索功能集成到现有应用程序中。此应用程序使用本地 Elasticsearch 设置;但是,您可以轻松地使用 AWS 或其他地方的托管实例。

To set up Elasticsearch, start by creating a new Docker network:
要设置 Elasticsearch,首先要创建一个新的 docker 网络:

docker network create elastic

Next, start the official Elasticsearch container:
接下来,启动官方的 Elasticsearch 容器:

docker run --name es01 \
  --net elastic \
  -p 9200:9200 \
  -p 9300:9300 \
  -m 4gb \
  -it docker.elastic.co/elasticsearch/elasticsearch:8.3.0

It’s important to use at least version 8.x as that’s the minimum version that supports the functionality required for semantic search. You might encounter issues with vm.max_map_count on startup; you can fix this by raising the value of vm.max_map_count with:
至少使用 8.x 版本很重要,因为这是支持语义搜索所需功能的最低版本。您可能会在启动时遇到 vm.max_map_count 的问题,您可以通过提高 vm.max_map_count 的值来解决此问题:

sysctl -w vm.max_map_count=262144

Or by changing the value in /etc/sysctl.conf.
或者通过更改 /etc/sysctl.conf 中的值。

Once the container starts, you’ll see a bunch of metadata logged including a password for the elastic user. You’ll want to save the password. I’ve got mine stored in an ELASTICSEARCH_PASSWORD environment variable.
容器启动后,您会看到记录了一堆元数据,包括 elastic 用户的密码。您需要保存密码。我已将我的存储在 ELASTICSEARCH_PASSWORD 环境变量中。

Elasticsearch makes use of SSL by default, so you’ll need to grab the certificate file from the running container. You can do this by opening a new terminal and running:
Elasticsearch 默认使用 SSL,因此您需要从正在运行的容器中获取证书文件。您可以通过打开一个新终端并运行以下命令来执行此操作:

docker cp es01:/usr/share/elasticsearch/config/certs/http_ca.crt .

Now you can verify the server is running with:
现在您可以验证服务器是否正在运行:

curl --cacert http_ca.crt -u elastic https://localhost:9200

You will be prompted for a password for the user elastic–this is the password you saved in the previous step. If the query was successful, you should get a 200 response with some metadata about the server:
系统将提示您输入用户 elastic 的密码 —— 这是您在上一步中保存的密码。如果查询成功,您应该会收到一个 200 响应,其中包含有关服务器的一些元数据:

{
  "name" : "a60143889402",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "GyaCxRKHSoOFbLXsza7q1A",
  "version" : {
    "number" : "8.3.0",
    "build_type" : "docker",
    "build_hash" : "5b8b981647acdf1ba1d88751646b49d1b461b4cc",
    "build_date" : "2022-06-23T22:48:49.607492124Z",
    "build_snapshot" : false,
    "lucene_version" : "9.2.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}

Configuring the Elasticsearch index
配置 Elasticsearch 索引

In Elasticsearch, a collection of documents is called an index.
在 Elasticsearch 中,文档的集合称为索引。

An index is somewhat similar to a relational database. In an index, you define different types (think tables) of documents and define the properties (think columns) of a document.
索引有点类似于关系数据库。在索引中,您可以定义不同类型的文档(如表)并定义文档的属性(如列)。

In this application, you want a user to be able to enter a natural language wine description and receive a list of potential wines back. That means you’ll just need a single “type”, i.e. wines. But what kind of properties does each wine type need to have?
在此应用程序中,您希望用户能够输入自然语言的葡萄酒描述并收到潜在葡萄酒的列表。这意味着您只需要一种“类型”,即葡萄酒。但是每种葡萄酒需要具备什么样的特性呢?

Realistically, you can define any number of properties you’d like; however, the most important property for your wine type is its document-vector.
实际上,您可以定义任意数量的属性;但是,您的葡萄酒类型最重要的属性是它的 document-vector

The document-vector is a dense vector representation of a document. BERT and other transformer models work by embedding text into a dense numerical representation.
document-vector 是文档的密集向量表示。 BERT 和其他转换器模型通过将文本嵌入到密集的数字表示中来工作。

These numerical representations exist in really high-dimensional space, close to other texts with similar meanings. For example, if you were to embed the words “king” and “queen” and visualize where they are in space, you’d notice they’re relatively close together. More than likely, they’d probably be near other words like “royalty”, “throne”, “prince”, and “princess.”
这些数字表示存在于真正的高维空间中,接近于具有相似含义的其他文本。例如,如果你要嵌入“king”和“queen”这两个词并想象它们在空间中的位置,你会注意到它们相对靠近。更有可能的是,它们可能与“皇室”、“王位”、“王子”和“公主”等其他词接近。

BERT can embed long strings of text into high-dimensional space, such that sentences with similar meanings lie close together in space. For example, the embeddings for “I like dogs” and “I like puppies” should be very similar as the meanings of the sentences are essentially identical.
BERT 可以将长串文本嵌入到高维空间中,使得具有相似含义的句子在空间中靠得很近。例如,“我喜欢狗”和“我喜欢小狗”的嵌入应该非常相似,因为句子的含义本质上是相同的。

So what does this have to do with search? Well, if you pre-compute the embeddings for some fixed number of documents, you can use the same technique for an input query, and compute the distance between the input query and all documents on hand. In the end, you’ll have a ranking of documents in order of their similarity to the input query.
那么这与搜索有什么关系呢?好吧,如果您预先计算了一些固定数量文档的嵌入,则可以对输入查询使用相同的技术,并计算输入查询与手头所有文档之间的距离。最后,您将按照与输入查询的相似性对文档进行排名。
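The ranking idea can be sketched in plain Elixir with made-up three-dimensional "embeddings" (real BERT embeddings have hundreds of dimensions). Documents are sorted by Euclidean (L2) distance to the query vector, the same metric the index in this post is configured with; ToySearch and the document names and vectors are invented for illustration.

```elixir
defmodule ToySearch do
  # Euclidean (L2) distance between two vectors represented as lists.
  def l2_distance(a, b) do
    a
    |> Enum.zip(b)
    |> Enum.map(fn {x, y} -> (x - y) * (x - y) end)
    |> Enum.sum()
    |> :math.sqrt()
  end

  # Rank documents from most to least similar to the query vector.
  def rank(documents, query_vector) do
    Enum.sort_by(documents, fn {_name, vec} -> l2_distance(vec, query_vector) end)
  end
end

docs = [
  {"bold cabernet", [0.9, 0.1, 0.0]},
  {"crisp riesling", [0.1, 0.8, 0.3]},
  {"dry rose", [0.2, 0.6, 0.5]}
]

# The query vector is closest to "crisp riesling", so it ranks first.
ToySearch.rank(docs, [0.0, 0.9, 0.4])
```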

Elasticsearch supports this kind of similarity search with dense_vector types, which allow you to store embeddings from models like BERT. Elasticsearch can then perform an approximate K-Nearest Neighbors search to determine the most similar documents to your query.
Elasticsearch 支持这种使用 dense_vector 类型的相似性搜索,它允许您存储来自 BERT 等模型的嵌入。然后,Elasticsearch 可以执行近似的 K 最近邻搜索来确定与您的查询最相似的文档。
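As a sketch of what such a search could look like against the wine index configured in this post, the request body below uses a script_score query with the Painless l2norm function over the document-vector field. The query_vector shown is a three-number placeholder; a real query vector produced by BERT would have 768 dimensions. Elasticsearch 8.x also offers dedicated approximate kNN search APIs for indexed dense_vector fields.

```json
{
  "query": {
    "script_score": {
      "query": { "match_all": {} },
      "script": {
        "source": "1 / (1 + l2norm(params.query_vector, 'document-vector'))",
        "params": { "query_vector": [0.12, -0.45, 0.33] }
      }
    }
  }
}
```

Scores are inverted distances, so the closest documents score highest.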

Knowing that, create a new file wine_index.json in a priv/elastic directory and copy the following:
知道后,在 priv/elastic 目录中创建一个新文件 wine_index.json 并复制以下内容:

{
  "mappings": {
    "properties": {
      "document-vector": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "l2_norm"
      }
    }
  }
}

Next, run the following to create a new index in Elasticsearch:
接下来,运行以下命令在 elasticsearch 中创建一个新索引:

curl --cacert http_ca.crt -XPUT -d @priv/elastic/wine_index.json -u elastic https://localhost:9200/wine -H 'Content-Type: application/json'

You should get a 200 response and a success message. You’ve successfully created an Elasticsearch index, now you need to add documents to the index!
您应该会收到 200 响应和成功消息。您已经成功创建了一个 Elasticsearch 索引,现在您需要将文档添加到索引中!

Embedding scraped wine products
为抓取的葡萄酒产品生成嵌入

With your index configured, you need to compute embeddings for each of the wines you have and add them to your index.
配置索引后,您需要为您拥有的每种葡萄酒计算嵌入并将它们添加到索引中。

Start by creating a new Elixir script in priv called embed_wine_documents.exs. This script will house the code for embedding wine documents–in a real application, you might choose to move this code into your application. For example, you’d probably want to periodically scrape the web for more wine products and update the index regularly.
首先在 priv 中创建一个名为 embed_wine_documents.exs 的新 Elixir 脚本。该脚本将包含用于嵌入 wine 文档的代码 —— 在实际应用程序中,您可以选择将此代码移动到您的应用程序中。例如,您可能希望定期在网上抓取更多葡萄酒产品并定期更新索引。

Fortunately, the logic for embedding documents and adding them to an index is the same regardless of whether it lives in a script or in your application.
幸运的是,无论是放在脚本中还是放在您的应用程序中,嵌入文档并将其添加到索引的逻辑都是一样的。

To embed documents and add them to the Elasticsearch index, you’ll need a few dependencies:
要嵌入文档并将它们添加到 Elasticsearch 索引中,您需要一些依赖项:

Mix.install([
  {:httpoison, "~> 1.8"},
  {:jason, "~> 1.3"},
  {:axon_onnx, "~> 0.2.0-dev", github: "elixir-nx/axon_onnx"},
  {:axon, "~> 0.2.0-dev", github: "elixir-nx/axon", override: true},
  {:exla, "~> 0.3.0-dev", github: "elixir-nx/nx", sparse: "exla"},
  {:nx, "~> 0.3.0-dev", github: "elixir-nx/nx", sparse: "nx", override: true},
  {:tokenizers, "~> 0.1.0-dev", github: "elixir-nx/tokenizers", branch: "main"},
  {:rustler, ">= 0.0.0", optional: true}
])

Some of these might be familiar.
其中一些可能很熟悉。

You’re probably familiar with httpoison and jason–both of these libraries are necessary to make requests to the Elasticsearch API.
您可能熟悉 httpoisonjason —— 这两个库都是向 Elasticsearch API 发出请求所必需的。

I’ve written about nx, exla, and axon in previous posts.
我在之前的帖子中写过 nxexlaaxon

nx and exla are the foundations of the Nx ecosystem, and axon is a deep learning library in Elixir. axon_onnx is a new library which allows you to import ONNX neural networks.
nxexla 是 Nx 生态系统的基础, axon 是 Elixir 中的深度学习库。 axon_onnx 是一个新的库,它允许您导入 ONNX 神经网络。

ONNX is an open neural network serialization format supported by most of the Python ecosystem. With axon_onnx, you can leverage the massive Python ecosystem of pre-trained models directly in Elixir.
ONNX 是一种开放的神经网络序列化格式,得到大多数 Python 生态系统的支持。借助 axon_onnx ,您可以直接在 Elixir 中利用庞大的预训练模型 Python 生态系统。

Finally, tokenizers is a library that offers bindings to HuggingFace tokenizers. Most transformer models like BERT rely on custom subword tokenizers to pre-process text sequences. You’ll need tokenizers to pre-process input data for use in a pre-trained BERT model.
最后, tokenizers 是一个提供 HuggingFace 分词器绑定的库。像 BERT 这样的 Transformer 模型大多依赖自定义的子词分词器来预处理文本序列。您需要 tokenizers 来预处理输入数据,以便在预训练的 BERT 模型中使用。

Now, create a module called EmbedWineDocuments and add the following code:
现在,创建一个名为 EmbedWineDocuments 的模块并添加以下代码:

defmodule EmbedWineDocuments do
  alias Tokenizers.{Tokenizer, Encoding}

  def format_document(document) do
    "Name: #{document["name"]}\n" <>
      "Varietal: #{document["varietal"]}\n" <>
      "Location: #{document["location"]}\n" <>
      "Alcohol Volume: #{document["alcohol_volume"]}\n" <>
      "Alcohol Percent: #{document["alcohol_percent"]}\n" <>
      "Price: #{document["price"]}\n" <>
      "Winemaker Notes: #{document["notes"]}\n" <>
      "Reviews:\n#{format_reviews(document["reviews"])}"
  end

  defp format_reviews(reviews) do
    reviews
    |> Enum.map(fn review ->
      "Reviewer: #{review["author"]}\n" <>
        "Review: #{review["review"]}\n" <>
        "Rating: #{review["rating"]}"
    end)
    |> Enum.join("\n")
  end
end

The wine documents are JSON files, but BERT only works on text–this code formats a document into a string representation. Next, you’ll need to tokenize the input to generate tensors for use in an Axon model. Add the following code to the module:
wine 文档是 JSON 文件,但 BERT 仅适用于文本 —— 此代码将文档格式化为字符串表示形式。接下来,您需要对输入进行分词,以生成可用于 Axon 模型的张量。将以下代码添加到模块中:

  def encode_text(tokenizer, text, max_sequence_length) do
    {:ok, encoding} = Tokenizer.encode(tokenizer, text)

    encoded_seq =
      encoding
      |> Enum.map(&Encoding.pad(&1, max_sequence_length))
      |> Enum.map(&Encoding.truncate(&1, max_sequence_length))

    input_ids = encoded_seq |> Enum.map(&Encoding.get_ids/1) |> Nx.tensor()
    token_type_ids = encoded_seq |> Enum.map(&Encoding.get_type_ids/1) |> Nx.tensor()
    attention_mask = encoded_seq |> Enum.map(&Encoding.get_attention_mask/1) |> Nx.tensor()

    %{
      "input_ids" => input_ids,
      "token_type_ids" => token_type_ids,
      "attention_mask" => attention_mask
    }
  end

This function computes the inputs required for the BERT model from a tokenizer and input text. Notice you also must provide a max sequence length. Axon requires fixed input shapes so, to handle variable sequence lengths, it’s common to pad or truncate inputs to a certain number of tokens.
此函数计算来自分词器和输入文本的 BERT 模型所需的输入。请注意,您还必须提供最大序列长度。 Axon 需要固定的输入形状,因此,为了处理可变序列长度,通常将输入填充或截断为一定数量的标记。
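The pad-or-truncate step can be illustrated in plain Elixir, independent of the Tokenizers library (which does this for you via Encoding.pad and Encoding.truncate). ToyPadding is a hypothetical helper; the pad id 0 matches BERT's [PAD] token, and the token ids below are invented.

```elixir
defmodule ToyPadding do
  # Truncate to at most `max_len` tokens, then pad with `pad_id` so every
  # sequence comes out exactly `max_len` tokens long.
  def pad_or_truncate(token_ids, max_len, pad_id \\ 0) do
    ids = Enum.take(token_ids, max_len)
    ids ++ List.duplicate(pad_id, max_len - length(ids))
  end
end

ToyPadding.pad_or_truncate([101, 7592, 102], 5)
# => [101, 7592, 102, 0, 0]
ToyPadding.pad_or_truncate([101, 7592, 2088, 999, 3000, 102], 5)
# => [101, 7592, 2088, 999, 3000]
```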

Finally, you’ll want to implement a function for computing embeddings using Axon:
最后,您需要使用 Axon 实现计算嵌入的函数:

  def compute_embedding(model, params, inputs) do
    Axon.predict(model, params, inputs, compiler: EXLA)
  end

The next step is to create a pipeline that iterates through each entry in your collection of wine documents, computes its embedding, and adds it to your existing Elasticsearch index. Start by adding the following code immediately after EmbedWineDocuments:
下一步是创建一个管道,该管道遍历您的 wine 文档集合中的每个条目,计算其嵌入,并将其添加到您现有的 Elasticsearch 索引中。首先在 EmbedWineDocuments 之后立即添加以下代码:

max_sequence_length = 120
batch_size = 128

{bert, bert_params} =
  AxonOnnx.import("priv/models/model.onnx", batch: batch_size, sequence: max_sequence_length)

bert = Axon.nx(bert, fn {_, out} -> out end)

{:ok, tokenizer} = Tokenizers.Tokenizer.from_pretrained("bert-base-uncased")

AxonOnnx.import/2 imports an existing ONNX model into an Axon model and parameters. It takes a path to an ONNX file. We’ll generate this model in a little bit. The additional Axon.nx layer just extracts the desired output from the original model. Tokenizers.Tokenizer.from_pretrained loads a pre-trained tokenizer for use with a pre-trained model.
AxonOnnx.import/2 将现有的 ONNX 模型导入 Axon 模型和参数。它采用 ONNX 文件的路径。我们稍后会生成这个模型。额外的 Axon.nx 层只是从原始模型中提取所需的输出。 Tokenizers.Tokenizer.from_pretrained 加载预训练的分词器以与预训练的模型一起使用。

In this example, the name of the model you’re using is the bert-base-uncased model, so you need to use its accompanying tokenizer.
在此示例中,您使用的模型名称是 bert-base-uncased 模型,因此您需要使用其附带的分词器。

Now, add the following code to complete your embedding pipeline:
现在,添加以下代码以完成嵌入管道:

path_to_wines = "priv/wine_documents.jsonl"
endpoint = "https://localhost:9200/wine/_doc/"
password = System.get_env("ELASTICSEARCH_PASSWORD")
credentials = "elastic:#{password}"
headers = [
  Authorization: "Basic #{Base.encode64(credentials)}",
  "Content-Type": "application/json"
]
options = [ssl: [cacertfile: "http_ca.crt"]]

document_stream =
  path_to_wines
  |> File.stream!()
  |> Stream.map(&Jason.decode!/1)
  |> Stream.map(fn document -> {document["url"], EmbedWineDocuments.format_document(document)} end)
  |> Stream.chunk_every(batch_size)
  |> Stream.flat_map(fn batches ->
    {urls, texts} = Enum.unzip(batches)
    inputs = EmbedWineDocuments.encode_text(tokenizer, texts, max_sequence_length)
    embedded = EmbedWineDocuments.compute_embedding(bert, bert_params, inputs)

    embedded
    |> Nx.to_batched(1)
    |> Enum.map(&Nx.to_flat_list(Nx.squeeze(&1)))
    |> Enum.zip_with(urls, fn vec, url -> %{"url" => url, "document-vector" => vec} end)
    |> Enum.map(&Jason.encode!/1)
  end)
  |> Stream.map(fn data ->
    {:ok, _} = HTTPoison.post(endpoint, data, headers, options)
    :ok
  end)

Enum.reduce(document_stream, 0, fn :ok, counter ->
  IO.write("\rDocuments Embedded: #{counter}")
  counter + 1
end)

The first few lines are just metadata for loading wine documents and sending requests to the Elasticsearch server.
前几行只是用于加载 wine 文档和向 Elasticsearch 服务器发送请求的元数据。

The document_stream streams lines from the wine document file, parses them using Jason, and then turns the parsed text into batched tensors using the convenience functions you defined earlier. The embedding is computed with Axon.predict(bert, bert_params, inputs, compiler: EXLA). Notice we perform the embedding computation on batches of input tensors, as batched inference is more efficient than predicting on single examples one at a time.
document_stream 从 wine 文档文件中流式读取行,使用 Jason 解析它们,然后使用您之前定义的便利函数将解析的文本转换为批处理的张量。嵌入是用 Axon.predict(bert, bert_params, inputs, compiler: EXLA) 计算的。请注意,我们对批量输入张量执行嵌入计算,因为批量推理比一次预测单个样本更高效。

This code sends just the URL and document vector to the index. You can store additional information if you’d like as long as you at least store the document-vector field.
此代码仅将 URL 和文档向量发送到索引。只要您至少存储 document-vector 字段,就可以存储其他信息。

Before running your script, you need to actually create the ONNX model for import. The HuggingFace Transformers library is a Python library that implements a large number of state-of-the-art pre-trained models. They have an out-of-the-box conversion tool that converts existing models from PyTorch to ONNX. Assuming you don’t already have it installed, you’ll need to install Python and pip, and then install the transformers library:
在运行脚本之前,您需要实际创建用于导入的 ONNX 模型。 HuggingFace Transformers 库是一个 Python 库,它实现了大量最先进的预训练模型。他们有一个开箱即用的转换工具,可以将现有模型从 PyTorch 转换为 ONNX。假设你还没有安装它,你需要安装 Python 和 pip,然后安装 transformers 库:

pip install transformers

Next, you’ll want to import the bert-base-uncased model from the command line using:
接下来,您需要使用以下命令从命令行导入 bert-base-uncased 模型:

python -m transformers.onnx --model=bert-base-uncased priv/models/

This will save a model.onnx file in a priv/models directory. The ONNX file contains the pre-trained BERT model.
这将在 priv/models 目录中保存一个 model.onnx 文件。 ONNX 文件包含预训练的 BERT 模型。

Now you can run this script with:
现在您可以运行此脚本:

$ elixir priv/embed_wine_documents.exs

It will take a few minutes to run. After it’s completed, you will have a populated index ready for search!
运行需要几分钟。完成后,您将拥有一个已填充的索引以供搜索!

Implementing search in Phoenix
在 Phoenix 中实现搜索

Now that you have a populated index, you need to handle user queries in your Phoenix application. Your application will need to:
现在您已经有了填充的索引,您需要在 Phoenix 应用程序中处理用户查询。您的应用程序将需要:

  1. Accept user queries from an input form
    接受来自输入表单的用户查询
  2. Compute embeddings from the text query
    从文本查询计算嵌入
  3. Search for similar embeddings with Elasticsearch
    使用 Elasticsearch 搜索相似的嵌入
  4. Return the top N similar results
    返回前N个相似结果

Start by opening up mix.exs and adding some additional dependencies:
首先打开 mix.exs 并添加一些额外的依赖项:

{:httpoison, "~> 1.8"},
{:jason, "~> 1.3"},
{:axon_onnx, "~> 0.2.0-dev", github: "elixir-nx/axon_onnx"},
{:axon, "~> 0.2.0-dev", github: "elixir-nx/axon", override: true},
{:exla, "~> 0.3.0-dev", github: "elixir-nx/nx", sparse: "exla"},
{:nx, "~> 0.3.0-dev", github: "elixir-nx/nx", sparse: "nx", override: true},
{:tokenizers, "~> 0.1.0-dev", github: "elixir-nx/tokenizers", branch: "main"},
{:rustler, ">= 0.0.0", optional: true}

You should also change floki so that it is no longer only a test dependency:

{:floki, ">= 0.30.0"},  

Notice these are the exact same dependencies you used when computing the wine embeddings for each document ahead of time. You can run mix deps.get to make sure you have everything you need. Next, open up your application.ex file and add the following line to start/2:

# Load the model into memory on startup
:ok = Wine.Model.load()

The Wine.Model.load/0 function will load the model on application start-up so you can handle inference requests from the Phoenix application. Next, create a model.ex file in the lib/wine/ directory and add the following code:

defmodule Wine.Model do
  @max_sequence_length 120

  def load() do
    {model, params} =
      AxonOnnx.import("priv/models/model.onnx", batch: 1, sequence: max_sequence_length())

    {:ok, tokenizer} = Tokenizers.Tokenizer.from_pretrained("bert-base-uncased")

    {_, predict_fn} = Axon.compile(model, compiler: EXLA)

    predict_fn =
      EXLA.compile(
        fn params, inputs ->
          {_, pooled} = predict_fn.(params, inputs)
          Nx.squeeze(pooled)
        end,
        [params, inputs()]
      )

    :persistent_term.put({__MODULE__, :model}, {predict_fn, params})
    # Load the tokenizer as well
    :persistent_term.put({__MODULE__, :tokenizer}, tokenizer)

    :ok
  end

  def max_sequence_length(), do: @max_sequence_length

  defp inputs() do
    %{
      "input_ids" => Nx.template({1, 120}, {:s, 64}),
      "token_type_ids" => Nx.template({1, 120}, {:s, 64}),
      "attention_mask" => Nx.template({1, 120}, {:s, 64})
    }
  end
end

The most important function in this module is load. It’s responsible for loading the model for use in later inference requests.

You do this by first loading a model and tokenizer in the same way you did in your embedding script. Next, you compile the model into its initialization and prediction functions using Axon.compile/2. Finally, you make use of the new EXLA.compile/2 function, which compiles and caches a version of your function to avoid compilation overhead on execution.

This function only compiles one version of the model’s predict function. In a production setting, you might want to compile additional versions for various batch sizes to handle overlapping requests.

After the model is compiled, you store the predict function and the model parameters using Erlang’s :persistent_term. :persistent_term provides global storage. The reason for using :persistent_term over other mechanisms is that :persistent_term avoids copying data when accessing elements–in other words, you won’t be repeatedly copying your model’s massive parameters every time you perform an inference.
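
As a toy illustration of this pattern (with a stand-in function and parameters instead of a real model, and a placeholder key), storing and retrieving a term looks like this:

```elixir
# Store the "model" once at startup. :persistent_term is part of OTP,
# so this works in any plain Elixir script. MyApp.Model is a
# placeholder key, not a module that has to exist.
params = %{weights: [0.5, -1.2, 3.0]}
predict_fn = fn params, x -> Enum.map(params.weights, &(&1 * x)) end

:persistent_term.put({MyApp.Model, :model}, {predict_fn, params})

# Later, from any process, read it back. Unlike an ETS lookup, the
# stored term is not copied into the caller's heap on access.
{predict_fn, params} = :persistent_term.get({MyApp.Model, :model})
predict_fn.(params, 2)
```

The flip side is that updating or erasing a persistent term is expensive (it triggers a scan of all process heaps), which is exactly why it suits write-once data like model parameters.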

Next, open up your router.ex and add a new search endpoint:

get "/:query", PageController, :index

Now, update the index clause in your PageController to look like:

alias Tokenizers.{Tokenizer, Encoding}

def index(conn, %{"query" => query}) do
  {predict_fn, params} = get_model()

  inputs = get_inputs_from_query(query)
  embedded_vector = predict_fn.(params, inputs) |> Nx.to_flat_list()

  case get_closest_results(embedded_vector) do
    {:ok, documents} ->
      render(conn, "index.html", wine_documents: documents, query: %{"query" => query})

    _error ->
      conn
      |> put_flash(:error, "Something went wrong")
      |> render("index.html", wine_documents: [], query: %{})
  end
end

def index(conn, _params) do
  render(conn, "index.html", wine_documents: [], query: %{})
end

You haven’t implemented most of these functions yet, but the logic is relatively straightforward. When the user sends a query to the server, you access the model using a get_model function.

Next, you parse the query into encoded inputs using a get_inputs_from_query function. Using the model and encoded inputs, you compute an embedded vector representation of the query and pass it to get_closest_results.

Finally, you handle error and success cases in the process of retrieving similar documents.

Now, copy the following code to implement these missing functions:

defp get_tokenizer() do
  :persistent_term.get({Wine.Model, :tokenizer})
end

defp get_model() do
  :persistent_term.get({Wine.Model, :model})
end

defp get_inputs_from_query(query) do
  tokenizer = get_tokenizer()

  {:ok, encoded_seq} = Tokenizer.encode(tokenizer, query)

  encoded_seq =
    encoded_seq
    |> Encoding.pad(Wine.Model.max_sequence_length())
    |> Encoding.truncate(Wine.Model.max_sequence_length())

  input_ids = encoded_seq |> Encoding.get_ids() |> Nx.tensor()
  token_type_ids = encoded_seq |> Encoding.get_type_ids() |> Nx.tensor()
  attention_mask = encoded_seq |> Encoding.get_attention_mask() |> Nx.tensor()

  %{
    "input_ids" => Nx.new_axis(input_ids, 0),
    "token_type_ids" => Nx.new_axis(token_type_ids, 0),
    "attention_mask" => Nx.new_axis(attention_mask, 0)
  }
end

defp get_closest_results(embedded_vector) do
  options = [ssl: [cacertfile: @cacertfile_path], recv_timeout: 60_000]

  password = System.get_env("ELASTICSEARCH_PASSWORD")
  credentials = "elastic:#{password}"

  headers = [
    Authorization: "Basic #{Base.encode64(credentials)}",
    "Content-Type": "application/json"
  ]

  query = format_query(embedded_vector)

  with {:ok, data} <- Jason.encode(query),
       {:ok, response} <- HTTPoison.post(@elasticsearch_endpoint, data, headers, options),
       {:ok, results} <- Jason.decode(response.body) do
    parse_response(results)
  else
    _error ->
      :error
  end
end

Most of these functions should be relatively straightforward.

get_inputs_from_query is almost identical to the encode_text function you used to compute document embeddings.

The get_closest_results function implements the logic of querying Elasticsearch and handling the response.

Both options and headers are similar to the options and headers you used to embed documents originally.
optionsheaders 都类似于您最初用于嵌入文档的选项和标题。

Next, you use a missing format_query function to convert the dense vector into a valid Elasticsearch query. Then, you encode the query, send the request to your running Elasticsearch server, and decode the response. If any of the steps fail along the way, you return an error.

Now you just need to implement the remaining helper functions. Add the following code to PageController:

defp format_query(vector) do
  %{
    "knn" => %{
      "field" => "document-vector",
      "query_vector" => vector,
      "k" => @top_k,
      "num_candidates" => @num_candidates
    },
    "_source" => ["url"]
  }
end

defp parse_response(response) do
  hits = get_in(response, ["hits", "hits"])
  case hits do
    nil ->
      :error

    hits ->
      results = Enum.map(hits, fn
        %{"_source" => result} ->
          url = result["url"]
          get_wine_preview(url)
      end)

      {:ok, results}
  end
end

defp get_wine_preview(url) do
  with {:ok, %{body: body}} <- HTTPoison.get(url),
       {:ok, page} <- Floki.parse_document(body) do
    title = page |> Floki.find(".pipName") |> Floki.text()
    %{url: url, title: title}
  else
    _error ->
      %{url: url, title: "Generic wine"}
  end
end

format_query/1 builds an Elasticsearch K-Nearest Neighbors query from a given dense vector. It takes a few parameters that control how the search works: k and num_candidates. k controls how many of the top-scoring results to return, and num_candidates controls the number of candidate documents to search from.

The search uses heuristics to trim the candidate set down to num_candidates documents before rank-ordering them to produce the final results. The higher num_candidates, the more expansive the search becomes.
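
Conceptually, the kNN query scores every candidate vector by its similarity to the query vector and keeps the top k. Here is a brute-force sketch of that ranking over toy vectors (the module, URLs, and numbers are made up; a real index like Elasticsearch only approximates this exact scan to stay fast):

```elixir
defmodule Knn do
  # Cosine similarity between two equal-length vectors.
  def cosine(a, b) do
    dot = Enum.zip(a, b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
    norm = fn v -> :math.sqrt(Enum.sum(Enum.map(v, &(&1 * &1)))) end
    dot / (norm.(a) * norm.(b))
  end

  # Exact top-k: score every document, sort descending, take k.
  def top_k(index, query, k) do
    index
    |> Enum.sort_by(fn {_url, vec} -> cosine(query, vec) end, :desc)
    |> Enum.take(k)
    |> Enum.map(fn {url, _vec} -> url end)
  end
end

index = [
  {"wine-a", [1.0, 0.0, 0.2]},
  {"wine-b", [0.1, 0.9, 0.0]},
  {"wine-c", [0.9, 0.1, 0.3]}
]

Knn.top_k(index, [1.0, 0.0, 0.25], 2)
# => ["wine-a", "wine-c"]
```

Scoring every document is linear in index size, which is why the real search narrows to num_candidates documents first rather than scanning everything.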

parse_response/1 and get_wine_preview/1 both handle rendering useful information about returned wines to the user.
parse_response/1get_wine_preview/1 都处理向用户呈现有关返回的葡萄酒的有用信息。

parse_response grabs the relevant information from a successful response–returning an error if it’s not present. get_wine_preview/1 uses HTTPoison and Floki to generate a preview from the URL of a given wine. You can make use of a database or Elasticsearch to track relevant information for each document as well.

Finally, you might have noticed a few constants present in some controller functions. You should declare each of them at the top of your module, like this:

@elasticsearch_endpoint "https://localhost:9200/wine/_knn_search"
@cacertfile_path "http_ca.crt"
@top_k 5
@num_candidates 100

With your controller complete, all you have left to do is adjust your index.html.heex template to handle search and search results. To do that, adjust your index.html.heex file to look like:

<section class="row">
  <form method="get" action="/">
    <input type="text" name="query" />
    <input type="submit" />
  </form>
</section>

<section class="row">
  <%= unless @wine_documents == [] do %>
    <h2>Results: </h2>
    <div class="container">
      <ul>
        <%= for wine <- @wine_documents do %>
          <li><a href={wine.url}><%= wine.title %></a></li>
        <% end %>
      </ul>
    </div>
  <% end %>
</section>

This will render a simple form and any search results that are present. That’s all you need! You’re ready to search for wines.

Searching for wines

With your application complete, you can fire up your server with:

mix phx.server

Navigate to your browser and enter a search like “I want a Cabernet Sauvignon that’s under $25”. The first query might take a bit of time to run; subsequent queries should be much faster. After some time, you will see your results.

Congratulations, you’ve just implemented a Semantic Search tool in pure Elixir!

Conclusion

In this post, you learned how to create a semantic search tool for matching wines to users from natural language queries.

Notice that you didn’t need to learn how to train any complex models. Axon and AxonOnnx enable you to take advantage of the powerful pre-trained models from other ecosystems natively in an Elixir application.

You should also notice that integrating Nx, EXLA, and Axon into an existing Phoenix application is relatively painless.

I hope this gives you some ideas about how the budding Elixir machine learning ecosystem can benefit you–even if you don’t want to learn how to train models.

Until next time!

Axon.Serving: Model Serving with Axon and Elixir

Neurons against a dark background

Machine Learning Advisor

Sean Moriarity

Since publishing, this approach to model serving has been deprecated. Find the new approach here.

When going from a notebook to a production environment, there are lots of considerations you need to take into account.

In previous posts, I’ve written a little bit about how to integrate Axon models into production environments with both native (Elixir-based) solutions and external model serving solutions.

In this post, I’m excited to introduce a new feature of Axon: Axon.Serving.

What is Axon.Serving?

Axon.Serving is a minimal model serving solution written entirely in Elixir. Axon.Serving can integrate directly into your existing applications with only a few lines of code. Axon.Serving makes minimal assumptions about your application’s needs, but implements some critical features for deploying fault-tolerant, low-latency models.

In other ecosystems, users are encouraged to use dedicated serving solutions such as TensorFlow Serving, TorchServe, and NVIDIA’s Triton Server. These are production-ready serving solutions with impressive feature-sets; however, they’re mostly designed to overcome limitations in the lingua franca of machine learning–Python.

Model serving solutions introduce an additional service into your application that you must manage and query with platform-specific GRPC or HTTP APIs. Axon.Serving is a pure Elixir solution that can integrate directly into your existing applications without introducing an additional service.

Additionally, because Axon builds on top of Nx, Axon.Serving is runtime-agnostic by default. That means you can take advantage of various backends and compilers, tuning for specific deployment scenarios just by changing some configuration options. Other serving solutions such as NVIDIA’s Triton Server tout similar features–allowing you to deploy TensorFlow, PyTorch, ONNX, and more models under a unified API.

However, Triton is designed specifically with server deployments in mind. On the other hand, there’s nothing that would limit you from using Axon.Serving in a mobile deployment environment. It’s feasible to imagine implementing a mobile application with LiveView Native, and deploying on-device models with Axon running on a CoreML backend.

Why can’t I just use Axon.predict/3?

At first glance, it might not be clear why you would want to use the Axon.Serving API instead of Axon’s normal inference APIs. In some settings, such as certain batch-inference scenarios, it can make sense to continue using more straightforward inference APIs like Axon.predict/3.

However, for online serving, where model requests arrive one at a time at irregular intervals, Axon.Serving is a must for ensuring low latency.

This first release of Axon.Serving implements two performance-critical features for serving scenarios:

  1. Eager model compilation with fixed shapes
  2. Dynamic batch queue

Axon depends on Nx, which makes use of JIT-compilation. When running Nx functions (like Axon models) for the first time, there is a slight compilation overhead. If you run functions with different shapes or types, you will incur a compilation cost every time.

Axon.Serving works by using Nx’s eager compilation API, Nx.Defn.compile, to JIT-compile inference functions on application start-up. This means when users make requests to your application, they use the compiled function and do not incur any compilation cost.

In addition to compilation cost, there is a lot of overhead when switching between ERTS and an Nx backend or compiler’s runtime.

When using a GPU, there is even more overhead for transferring data to the GPU. In training scenarios, these overheads are offset with large batch sizes.

Using a batch size of one on modern GPUs ends up wasting significant amounts of resources. That’s because GPU latency is not very sensitive to batch size. A model with an average latency of 100ms at batch-size one will generally exhibit the same latencies (up to a certain point) as when you scale the batch size up.

Rather than service one request with batch-size one at a time, you should try to service requests in bulk to avoid bottlenecks.

Imagine a scenario where you receive requests at 20ms intervals for a model that takes 100ms to process. If you receive five requests, the fifth request will have a perceived latency of 500ms if your application is designed to service model requests one at a time. If you instead choose to batch requests, you can significantly lower the perceived latency of later requests by sacrificing some latency for earlier requests.
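
To make the arithmetic concrete, here is that scenario under simplifying assumptions (requests queue behind one another and every inference pass takes the full 100ms):

```elixir
model_ms = 100
arrival_gap_ms = 20
n_requests = 5

# One-at-a-time serving: the nth request waits for n full inference passes.
sequential_latencies = Enum.map(1..n_requests, &(&1 * model_ms))
# => [100, 200, 300, 400, 500]

# Batched serving: everyone waits for the last arrival, then shares one pass.
# The first request waits the longest: 4 arrival gaps plus one inference.
batched_worst_case = (n_requests - 1) * arrival_gap_ms + model_ms
# => 180
```

The first request gives up 80ms of latency so that the fifth drops from roughly 500ms to a single 100ms pass, which is the trade described above.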

In order to achieve this bulk-processing effect, Axon.Serving implements a dynamic batch queue.

When configuring your model to use Axon.Serving, you specify a maximum batch size and a maximum wait time. For example, if you specify a maximum batch size of 16 and a maximum wait time of 25, your model will process requests in batches of 16, waiting at most 25ms for the queue to fill up before executing model inference.

Rather than servicing requests eagerly, Axon.Serving delays inference until either the queue fills up, or the maximum batch size is reached. To meet Nx’s static-shape requirements, Axon.Serving sacrifices some memory efficiency by padding all batches to the given maximum batch size.
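
To illustrate the queueing mechanics in plain Elixir, here is a deliberately simplified sketch of a dynamic batch queue. This is not Axon.Serving’s actual implementation: the module name and :handler option are invented for the example, and it skips padding, tensor shapes, and error handling:

```elixir
defmodule BatchQueue do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  # Callers block here until their input is processed as part of a batch.
  def predict(pid, input), do: GenServer.call(pid, {:predict, input})

  @impl true
  def init(opts) do
    {:ok,
     %{
       batch_size: Keyword.fetch!(opts, :batch_size),
       batch_timeout: Keyword.fetch!(opts, :batch_timeout),
       # The handler maps a list of inputs to a list of results.
       handler: Keyword.fetch!(opts, :handler),
       buffer: [],
       timer: nil
     }}
  end

  @impl true
  def handle_call({:predict, input}, from, state) do
    state = %{state | buffer: [{from, input} | state.buffer]}

    cond do
      # Queue is full: run the batch immediately.
      length(state.buffer) >= state.batch_size ->
        {:noreply, flush(state)}

      # First request of a new batch: start the timeout clock.
      state.timer == nil ->
        timer = Process.send_after(self(), :flush, state.batch_timeout)
        {:noreply, %{state | timer: timer}}

      true ->
        {:noreply, state}
    end
  end

  @impl true
  def handle_info(:flush, state), do: {:noreply, flush(state)}

  defp flush(%{buffer: []} = state), do: %{state | timer: nil}

  defp flush(state) do
    if state.timer, do: Process.cancel_timer(state.timer)
    {froms, inputs} = state.buffer |> Enum.reverse() |> Enum.unzip()
    results = state.handler.(inputs)
    Enum.zip(froms, results) |> Enum.each(fn {from, r} -> GenServer.reply(from, r) end)
    %{state | buffer: [], timer: nil}
  end
end
```

With a batch size of 16 and a timeout of 100ms, overlapping predict/2 calls get serviced together as one handler invocation, which is the behavior described above.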

Despite the simplicity of these two features, Axon.Serving is capable of achieving competitive performance with other model serving frameworks.

I recently shared benchmarks of Axon.Serving versus TorchServe on the same model. Axon.Serving using EXLA integrated with a vanilla Phoenix application is actually more performant than TorchServe.

How do I use Axon.Serving?

Axon.Serving requires minimal changes to your application. In an existing Phoenix application, you can start an Axon.Serving instance by adding the following to your application.ex:

# start a ResNet instance
{Axon.Serving,
 model: MyApp.Models.load_resnet(),
 name: :resnet,
 shape: {nil, 3, 224, 224},
 batch_size: 16,
 batch_timeout: 100,
 compiler: EXLA}

This will start a serving instance named :resnet. :model must be a tuple of {model, params}–in this example you dispatch the model loading code to a separate module. :shape indicates your model’s input shapes. The specified shape must be compatible with the shape specified in model. :batch_size and :batch_timeout represent the maximum dynamic queue size and queue timeout respectively. All other options (e.g. :compiler) are forwarded to Nx.Defn.compile.

Now, whenever you want to get predictions from your model, you can use Axon.Serving.predict/2:

defmodule MyAppWeb.ImageController do
  use MyAppWeb, :controller

  def predict(conn, %{"image" => image}) do
    image_tensor = normalize_input(image)
    result = Axon.Serving.predict(:resnet, image_tensor)
    normalize_and_render_output(conn, result)
  end
end

Under the hood, Axon.Serving will compile the ResNet model on application start-up. Overlapping requests to :resnet will be batched automatically.

And that’s it! Like I said, Axon.Serving is intentionally minimalistic. With a few lines of Elixir, you can replace an entire external model serving service.

If your application is already using Elixir but defers to Python and a model serving solution for machine learning, you might find it to be a serious quality of life improvement to convert your model into a format Axon can work with, and make use of Axon.Serving integrated directly with your Elixir application.

What Axon.Serving can’t do

Axon.Serving is intentionally minimal. Model serving solutions like TorchServe and Triton are batteries-included: they implement things like response caching, rate limiting, and model management out of the box.

Axon.Serving takes a less opinionated approach, allowing you to work these features into your application as you see fit. If you’re looking for something that’s batteries-included, you will probably want to consider other options.

Conclusion

Axon.Serving is an exciting new feature that makes integrating Axon into production applications seamless.

However, I’d like to emphasize that Axon.Serving is new. You might encounter edge-cases and missing features. I encourage you to experiment with Axon.Serving and report any and all issues, failure-cases, performance problems, etc.

Additionally, there are many features we are experimenting with and considering adding to the API. If there’s anything you think should be included, don’t hesitate to open an issue.

Until next time!

Unlocking the power of Transformers with Bumblebee

A bumblebee covered in pollen on a purple flower against a bright blue background

Machine Learning Advisor

Sean Moriarity

Introduction

Perhaps the most popular machine learning library in existence is the HuggingFace Transformers library.

HuggingFace is a platform focused on democratizing artificial intelligence. Their transformers library implements hundreds of transformer models with which you can load pre-trained checkpoints from the most popular AI labs around the world including Facebook, Microsoft, and Google.

The transformers library also simplifies the process of using these models for both training and inference. With HuggingFace Transformers, you can implement machine translation, text completion, text summarization, zero-shot classification, and much more in just a few lines of Python code.

And now, with Elixir’s recently announced Bumblebee library, you can do all of the same things with a few lines of Elixir code.

What are Transformers?

Transformer models are a type of deep learning architecture that have achieved state-of-the-art performance in a wide range of tasks in natural language processing, computer vision, and more. The most famous models in the tech world including GPT3 and DALL-E make use of a transformer architecture.

Transformer models, in contrast to other architectures, scale exceptionally well.

More data and a larger model generally yield better performance for transformers. While scale yields better models, it also requires more resources to train. That means it’s more difficult for individual researchers and small businesses to take advantage of the power of transformers.

To bridge this gap, HuggingFace offers a hub for large labs to resource their models to the public–allowing individuals to take advantage of massive models without the need to train them on their own.

What is Bumblebee?

Bumblebee is a project out of the Elixir Nx ecosystem which aims to implement a number of pre-trained Axon models and integrate with popular model “hubs”–most notably HuggingFace Hub.

The Python ecosystem has a number of model hubs that essentially act as git repositories for pre-trained model parameters and checkpoints. The HuggingFace hub is easily the most popular, with over 60,000 pre-trained models and millions of downloads across all models.

Access to pre-trained models and accessible machine learning applications is a massive advantage of the Python ecosystem. Bumblebee is an attempt to bridge the gap between the Python ecosystem and the Elixir ecosystem–without imposing any runtime constraints on the user.

The Bumblebee library is written in 100% pure Elixir. All of the models in Bumblebee are implemented in Axon and converted from supported checkpoints using pure Elixir code. At the time of this writing, Bumblebee interfaces with HuggingFace hub and can convert PyTorch checkpoints to Axon parameters for use directly with Axon models.

Much like HuggingFace Transformers, Bumblebee also aims to simplify common tasks around machine learning models such as text generation models, translation models, and more.

All of this code is written in Nx, meaning that Bumblebee can support pluggable backends and compilers out of the box. Depending on your use case, you can use the same library to target deployment scenarios at the edge all the way up to massive server deployments–just by changing the Nx backend or compiler.

At the time of this writing, Bumblebee is still in the early phases of development; however, it’s already quite powerful. In this post, I’ll highlight some of the incredible things you can do with Bumblebee right now.

Setup

Before running these examples, you’ll want to fire up a Livebook and install the following dependencies:

Mix.install([
  {:bumblebee, "~> 0.1.0"},
  {:axon, "~> 0.3"},
  {:exla, "~> 0.4"},
  {:nx, "~> 0.4"}
])

You’ll also want to configure your environment to use EXLA:

Nx.global_default_backend(EXLA.Backend)

If you have access to a GPU or another accelerator, it will speed up these examples, but it’s not necessary to run them.

Text Summarization

Text summarization is a language understanding task that requires a model to produce a condensed representation of a longform text. With the unreasonable effectiveness of transformer models, text summarization is relatively easy. With Bumblebee, you can implement text summarization in a few lines of Elixir code. First, you just need to load a model:

model_name = "facebook/bart-large-cnn"

{:ok, model} = Bumblebee.load_model({:hf, model_name})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, model_name})

The model name is the name of a model checkpoint from the HuggingFace hub. This example uses a pre-trained model from Facebook. Bumblebee.load_model/1 and Bumblebee.load_tokenizer/1 do the work of importing the correct model, parameters, and tokenizer from the HuggingFace Hub. Both load_model/1 and load_tokenizer/1 are designed to support other types of model hubs. So you need the {:hf, model} tuple to tell Bumblebee that the model you’re trying to pull is from HuggingFace.

Summarization is really just a generation task:

serving = Bumblebee.Text.Generation.generation(model, tokenizer, min_length: 10, max_length: 20)


This will return an %Nx.Serving{} struct, which wraps the generation logic behind an easy-to-use interface. Now you can run your task with:

article = """
Elixir is a dynamic, functional language for building scalable and maintainable applications.\
Elixir leverages the Erlang VM, known for running low-latency, distributed, and fault-tolerant systems. Elixir is successfully used in web development, embedded software, data ingestion, and multimedia processing, across a wide range of industries
"""

Nx.Serving.run(serving, article)


And after a while you will see the summary:

"Elixir is a dynamic, functional language for building scalable and maintainable applications."


Not too bad! With just three function calls you have a pretty powerful text summarization implementation!

Under the hood, the serving first calls Bumblebee.apply_tokenizer/2, which tokenizes the input into discrete integer tokens for the model to consume. Bumblebee’s tokenizers are backed by HuggingFace’s fast Rust tokenizer implementations.

Next, the serving uses your pre-trained model and parameters to perform text generation based on the input text. In this case, the goal of the generation task is to produce a summary of the input text.

Finally, the Bumblebee.Tokenizer.decode/2 function takes the generated integer tokens and maps them back to string values which you can understand and interpret.
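In a real application, you typically won’t call Nx.Serving.run/2 inline. Instead, you can start the serving under your supervision tree and call it by name with Nx.Serving.batched_run/2, which also batches concurrent requests for you. A minimal sketch (the MyApp.Summarizer name is just an example):

```elixir
# In your application's supervision tree (e.g. MyApp.Application.start/2).
# `serving` is the %Nx.Serving{} built above with
# Bumblebee.Text.Generation.generation/3.
children = [
  {Nx.Serving, serving: serving, name: MyApp.Summarizer, batch_timeout: 100}
]

Supervisor.start_link(children, strategy: :one_for_one)

# Anywhere else in the application, callers share the same model process:
Nx.Serving.batched_run(MyApp.Summarizer, article)
```

This way the model and its parameters are loaded once, and every caller in your application reuses the same serving process.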

Machine Translation

Another common language understanding task is machine translation. If you think about it, machine translation is very similar to text summarization; however, rather than mapping a longer representation to a shorter one, the goal is to map a representation in one language to an equivalent representation in another.

It should make sense then that you can implement the machine translation task in Bumblebee just by changing the model from your previous example:

model_name = "facebook/mbart-large-en-ro"

{:ok, model} = Bumblebee.load_model({:hf, model_name},
  module: Bumblebee.Text.Mbart,
  architecture: :for_conditional_generation
)
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, model_name})

article = """
Elixir is a dynamic, functional language for building scalable and maintainable applications.
"""

serving = Bumblebee.Text.Generation.generation(model, tokenizer,
  max_new_tokens: 20,
  forced_bos_token_id: 250041
)
Nx.Serving.run(serving, article)


And after a while you will see the following output:

"un limbaj dinamic, funcţional pentru ameliora aplicaţiile cu o capacitate ridicat"

Not much has changed here! All you need to do is change the model to a machine translation model (also from Facebook) and pass an additional :forced_bos_token_id option when building the serving.

This option essentially tells the model that the translation should be in Romanian. By changing the token to the code associated with any of the model’s supported languages, you can translate English sentences into any language the model supports!

Text Completion

Perhaps the first large language model to go mainstream was GPT3. Its completions endpoint can generate incredibly realistic text that can be useful in a number of applications such as creating realistic chatbots. With Bumblebee, you can make use of GPT3’s predecessor: GPT2.

All you need to do is load the model:

model_name = "gpt2"

{:ok, model} = Bumblebee.load_model({:hf, model_name}, architecture: :for_conditional_generation)
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, model_name})


Notice you need to specify the architecture here to tell Bumblebee you’d like to use this model for a generation task. Now with a simple prompt and a few lines of Elixir:

prompt = "Elixir is a dynamic, functional language for"

serving = Bumblebee.Text.Generation.generation(model, tokenizer, min_length: 10, max_length: 30)
Nx.Serving.run(serving, prompt)


And after a while you will see:

"Elixir is a dynamic, functional language for building and manipulating data structures. It is a powerful tool"


Once again, not bad! And for only a few lines of Elixir code, I would say this is pretty impressive!

And much more…

These examples only scratch the surface of what Bumblebee is truly capable of. Bumblebee also supports a number of vision models that were not highlighted here.

In addition to the tasks presented here, you can also use models loaded in Bumblebee for zero-shot and one-shot classification, question answering, sentence similarity, and much more. Additionally, you can make use of Bumblebee as a medium for importing pre-trained checkpoints for fine-tuning in Axon. This means you can tailor powerful models to your own specific use cases.

Of course, Bumblebee is still in its early stages, but it’s a promising step in the right direction that enables Elixir programmers to take advantage of powerful pre-trained models with minimal code.

If there are models you’d like to see implemented, or pipelines not yet built-in to Bumblebee you’d like to see, feel free to open an issue. Additionally, if you’d like to get involved, we’d love to have you work with us in the Erlang Ecosystem Foundation Machine Learning Working Group.

Until next time :)

Stable Diffusion with Bumblebee

A bumblebee hovering in front of a purple flower

Machine Learning Advisor

Sean Moriarity

Stable diffusion is perhaps the most popular deep learning model in use today.

Stable diffusion is a powerful variant of a class of models called diffusion models, which make use of a special process for generating images from random noise.

Stable diffusion specifically implements conditional diffusion or guided diffusion, which means you can control the output of the model with text descriptions of the image you want to render. Stable diffusion is completely open-source and now, thanks to Bumblebee, you can use it in Elixir.

Installing and Using Bumblebee

In one of my previous blog posts, I introduced the Bumblebee library and showed some examples of the power of using Bumblebee for your machine learning applications.

At a high level, Bumblebee allows you to import models directly from remote repositories like the HuggingFace Hub. Bumblebee is capable of converting pre-trained models directly from PyTorch into Axon.

If there’s a model you want to use that’s available in PyTorch, you just need to find or implement an equivalent model in Axon and Bumblebee and import the model’s parameters. Bumblebee opens up a wide range of possible applications for the Elixir machine learning ecosystem.

You can install Bumblebee from Hex:

{:bumblebee, "~> 0.1.0"}


With Bumblebee installed, you can make use of its high-level APIs. Typically, you’ll load models and tokenizers using Bumblebee.load_model/2 and Bumblebee.load_tokenizer/2:

{:ok, bert_spec} = Bumblebee.load_model({:hf, "bert-base-cased"})
{:ok, bert_tokenizer} = Bumblebee.load_tokenizer({:hf, "bert-base-cased"})


Loading Diffusion Models with Bumblebee

To make use of stable diffusion in Elixir, you’ll need to start by loading a few different models, tokenizers, and featurizers. Stable diffusion is actually a pipeline that makes use of four different models for different steps in the image generation process. Specifically, you need to load:

  1. A CLIP Text model
  2. A CLIP Vision model
  3. A VAE model
  4. A conditional U-NET model
  5. A CLIP Vision featurizer
  6. A CLIP Text tokenizer
  7. A diffusion scheduler

Now, you can fire up Livebook or a code editor and install the following dependencies:

Mix.install([
  {:bumblebee, "~> 0.1.0"},
  {:nx, "~> 0.4.0"},
  {:exla, "~> 0.4.0"},
  {:kino, "~> 0.8.0"}
])


Before loading any models, you’ll want to set EXLA as your default backend. Note that Stable Diffusion is an involved model. If you’d like to run on a GPU, you’ll likely need one with at least 10GB of memory. If you run on CPU, you’ll need to be patient, as image generation may take a very long time:

Nx.default_backend(EXLA.Backend)


Next, you can use Bumblebee’s loading primitives to load the models and featurizers you need:

{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/clip-vit-large-patch14"})

{:ok, clip} =
  Bumblebee.load_model(
    {:hf, "CompVis/stable-diffusion-v1-4", subdir: "text_encoder"}
  )

{:ok, vae} =
  Bumblebee.load_model(
    {:hf, "CompVis/stable-diffusion-v1-4", subdir: "vae"},
    architecture: :decoder,
    params_filename: "diffusion_pytorch_model.bin"
  )

{:ok, unet} =
  Bumblebee.load_model(
    {:hf, "CompVis/stable-diffusion-v1-4", subdir: "unet"},
    params_filename: "diffusion_pytorch_model.bin"
  )

{:ok, scheduler} =
  Bumblebee.load_scheduler(
    {:hf, "CompVis/stable-diffusion-v1-4", subdir: "scheduler"}
  )

{:ok, featurizer} =
  Bumblebee.load_featurizer(
    {:hf, "CompVis/stable-diffusion-v1-4", subdir: "feature_extractor"}
  )

{:ok, safety_checker} =
  Bumblebee.load_model(
    {:hf, "CompVis/stable-diffusion-v1-4", subdir: "safety_checker"}
  )


For each model, scheduler, featurizer, or tokenizer, you just need to make use of the correct Bumblebee API. Typically, you need to specify a repo and path for each model you want to load. The tuple {:hf, "CompVis/stable-diffusion-v1-4"} tells Bumblebee to look for the repository "CompVis/stable-diffusion-v1-4" in the HuggingFace hub (as indicated by :hf). The stable diffusion repository is actually a collection of several models, so for each separate model you need to specify a subdirectory to check out.

Generating Images

With your model loaded, you can start to generate images using Bumblebee’s diffusion API. At the time of this writing, Bumblebee only implements stable diffusion as an API; however, it’s entirely feasible for Bumblebee to support other forms of diffusion.

The stable diffusion API is exposed through the Bumblebee.Diffusion.StableDiffusion.text_to_image/6 function. The function takes as input all of the models loaded in the previous section, along with options to control the generation, and returns a serving that you run against a prompt:

stable_diffusion_serving =
  Bumblebee.Diffusion.StableDiffusion.text_to_image(clip, unet, vae, tokenizer, scheduler,
    num_steps: 20,
    num_images_per_prompt: 2,
    safety_checker: safety_checker,
    safety_checker_featurizer: featurizer,
    defn_options: [compiler: EXLA]
  )


The diffusion function actually returns an %Nx.Serving{} struct, which is a high-level API intended for use in deployment pipelines. The serving takes care of things like pre- and post-processing for you, so you can work directly with high-level inputs and outputs.

This serving takes as input a string prompt or a map with keys :prompt and :negative_prompt. Negative prompts are prompts you want the diffusion process to ignore or steer away from. You can run your diffusion model with:

output = Nx.Serving.run(stable_diffusion_serving, %{
  prompt: "narwhal, on a dock, computer, friendly, digital art",
  negative_prompt: "dark, foggy"
})


Stable diffusion prompts are different from DALL-E prompts in that it’s better to phrase the prompt as a list of attributes rather than as coherent text. Feel free to change this prompt to something more open-ended.

In addition to the models, prompt, and negative prompt, there are a few options that control the generated output. Most notable is the num_steps parameter, which controls the number of diffusion steps used during the process. More steps will lead to higher quality images; however, the generation process will be significantly slower.

The stable diffusion pipeline is somewhat slow even with a small number of steps, so you should be mindful to trade off compute for generated quality.

output takes the form of a map of tensors which represent images. To visualize (and later save) the actual images, you can use Kino.Image:

for result <- output.results do
  Kino.Image.new(result.image)
end
|> Kino.Layout.grid(columns: 2)


This will convert each image to a PNG and render them to the screen. You can save them from there.

You should note that the generation process will take a bit of time, but eventually you should see output PNGs. Because the diffusion process is sensitive to randomness, your results may vary. For example, for this prompt, the pipeline generated the following images:

An abstract image on a laptop screen that looks vaguely like a narwhal

A shape on water that resembles the body of a pink dolphin, with an appendage perpendicular to the body at the far right edge of the frame

Moving Forward

While this is a relatively simple example, it should open your eyes to the possibilities with Bumblebee.

For example, there are numerous variants of stable diffusion specialized on very specific use cases. Most, if not all, of these can be used directly from Elixir with Bumblebee. Combined with some of Elixir, Nx, and Axon’s high-level deployment capabilities, you can have a production application backed by stable diffusion in just a few minutes.

Additionally, despite the length of this post, you can actually implement this same exact pipeline using Kino’s built-in stable diffusion smart cell, without needing to write any code.

Before concluding, I need to give a shoutout and thank you to Jonatan Klosko, who architected much of Bumblebee’s APIs, conversion functionality, and more. Without Jonatan, there would be no stable diffusion in Elixir (or Livebook!).

Until next time :)

Search and Clustering with ExFaiss

Groups of chess pawns sorted into various circles, some of which intersect and some of which don't, on a dark surface

Machine Learning Advisor

Sean Moriarity

Introduction

In one of my previous posts, Semantic Search with Phoenix, Axon, and Elastic, I described a solution for integrating machine-learning-based vector search into your existing applications using Elasticsearch.

While Elasticsearch is a powerful tool for full-text search, its vector-search support is much newer and lacks the features of dedicated vector-search tools. Most notably, Elasticsearch does not support GPU-based similarity search. GPU-based algorithms can often be orders of magnitude faster than their CPU counterparts. Additionally, Elasticsearch dense-vector indices offer less control and fewer features than other vector index libraries, which can hinder your ability to build similarity search applications at scale.

Faiss, which stands for (F)acebook (AI) (S)imilarity (S)earch, is a library “for efficient similarity search and clustering of dense vectors.” Faiss offers extensive customization of indices, as well as GPU-based algorithms, and has been proven to work at scales of 1 trillion vectors. Faiss natively supports Python and C++, but now you can take advantage of Faiss in Elixir with ExFaiss.

What is Vector Search?

Vector search is a type of search built on Approximate Nearest Neighbors (ANN) algorithms that find candidate vectors nearest to query vectors in some search space. Most often the vectors in a vector search are a machine-learning-derived, embedded representation of unstructured inputs such as text or images.

A typical vector search workflow looks something like this (credit Elastic):
A diagram demonstrating how vector searches function

Vector search assumes you have a model that suitably embeds an unstructured input into a structured vector representation, where similar unstructured inputs would also map to similarly structured vectors. Mathematically speaking, “similar” just means vectors lie close together in space. You can kind of think of embedding models as generalizations of classification models; however, rather than trying to map an input to a discrete output label, embedding models attempt to map inputs to a unique, high-dimensional, and continuous representation.

Visually speaking, classification problems slice up the output space into discrete buckets. Imagine a model that classifies images of various animals. One bucket represents cats, one bucket represents dogs, another represents birds, and so on. If you were doing this process manually, you would likely have a pile of images for each animal.

Now, imagine that you wanted to break the process up even further–not only do you want to classify animals, but you also want to classify breeds within animal classes. Your dog bucket gets further divided into various breeds like Golden Retriever, German Shepherd, and more–as do all your other buckets. Visually you now have a smaller bucket within a larger bucket that represents breeds.

You can continue this process of dissecting buckets into even smaller buckets based on any number of animal features until you have a unique “bucket” for each image in your dataset. If you assign a label or value to each bucket, you end up with a unique encoding that represents each image. This representation maps to some point in space, where animals with more similar features are closer together (in many of the same buckets), and others are farther apart.

With suitable embedded representations, you can use an algorithm to determine which vectors are most similar (i.e. closest in space). With small indices, you can compute a distance metric (such as L1 or L2 norm) from a query vector to every other vector in the index and return the top k candidate vectors with the smallest distance metric.
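As a rough illustration of what a brute-force search does under the hood, here is a small Nx sketch that computes squared L2 distances from one query vector to every vector in a toy index and returns the labels of the k closest. This is only for intuition; ExFaiss does this in optimized native code:

```elixir
index_vectors = Nx.random_normal({100, 512})
query = Nx.random_normal({512})

# Squared L2 distance from the query to every indexed vector.
diff = Nx.subtract(index_vectors, query)   # query broadcasts over all rows
distances = diff |> Nx.multiply(diff) |> Nx.sum(axes: [1])

# Labels of the k closest vectors, nearest first.
k = 5
labels = distances |> Nx.argsort() |> Nx.slice([0], [k])
```

For a few hundred vectors this is perfectly fine; it’s the linear scaling with index size that makes ANN algorithms necessary.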

But for large indices, this kind of brute-force search isn’t possible. Instead, you need to rely on Approximate Nearest Neighbor (ANN) algorithms to efficiently prune the search space and return a “best guess” at which candidate vectors are closest to a query vector.

With the rise of Transformer models, vector search is becoming more powerful, and more popular. Transformer models rely on representation learning and work well with multiple modalities. With transformers, it’s easy to extract a valuable representation from any type of data. Combined with vector search tools, you can rapidly build applications with integrated machine-learning-powered search and retrieval.

Vector Search with ExFaiss

ExFaiss is an Elixir library that interfaces with Facebook’s Faiss. ExFaiss is a relatively low-level wrapper around Faiss that makes it possible to create and use Faiss resources from Elixir. Faiss (and ExFaiss by design) are not full-featured vector databases, but are instead frameworks for creating and using vector indices. You can read a little more about the distinction in this article from Pinecone.

What this means is that ExFaiss is intentionally slim, and makes no assumptions about your intended use of the library. Production use cases will have a number of additional infrastructure requirements. I will discuss some deployment considerations later on in this post.

Before starting, you’ll want to fire up a Livebook and install the following dependencies:

Mix.install([
  {:ex_faiss, github: "elixir-nx/ex_faiss"},
  {:nx, "~> 0.4"},
  {:exla, "~> 0.4"}
])

==> ex_faiss
make: '/home/sean/.cache/mix/installs/elixir-1.15.0-dev-erts-13.0.4/fa333a13bd85942bd7ed9d07a396ad43/_build/dev/lib/ex_faiss/priv/libex_faiss.so' is up to date.
Compiling 1 file (.ex)
:ok

Creating an Index

The fundamental object in Faiss is the index object, which encapsulates a vector index. Faiss actually supports a wide range of index types with fine-grained customization options. In ExFaiss, you can create an index using ExFaiss.Index.new/3:

index = ExFaiss.Index.new(512, "Flat")

%ExFaiss.Index{dim: 512, ref: #Reference<0.3614244527.2391146505.202990>, device: :host}

ExFaiss.Index.new/3 creates a new vector index with the given dimensionality and description. The description is a string that matches the spec described in The index factory, where you can find out more about the different types of indices and modifiers. The Flat index is the most basic kind of index: it just performs a brute-force search. When building your application, you should read through Faiss’s “Guidelines to choose an index” to determine which index works for you.

The dimensionality of an index just indicates what the size of each of the vectors in the index must be. In this case, each vector must have a dimensionality of 512. ExFaiss.Index.new/3 also takes options for modifying the index. For example, you can change the distance metric from the default :l2 to one of a number of supported metrics:

index = ExFaiss.Index.new(512, "Flat", metric: :l1)

%ExFaiss.Index{dim: 512, ref: #Reference<0.3614244527.2391146505.203005>, device: :host}

In addition to changing the metric, you can also change the device the index resides on. For example, if you have an Nvidia GPU and want to take advantage of GPU acceleration, you can pass device: :cuda:

index = ExFaiss.Index.new(512, "Flat", device: :cuda)


On first run, it may take some time to initialize a GPU index—this was an issue I ran into. If you just wait it out, eventually you’ll get an index back.

If you have multiple GPUs, you can target specific GPUs on your machine by passing a tuple:

index = ExFaiss.Index.new(512, "Flat", device: {:cuda, 1})


Faiss also supports transparent multi-device indices; however, this is not yet supported in ExFaiss.

Adding Vectors to an Index

After creating an index, you can add vectors in arbitrary batches. ExFaiss integrates directly with Nx, so you can add vectors using Nx tensors:

index = ExFaiss.Index.add(index, Nx.random_normal({512}))

%ExFaiss.Index{dim: 512, ref: #Reference<0.3614244527.2391146505.203005>, device: :host}

If you query the index for its current size, you’ll see you now have 1 vector in the index:

ExFaiss.Index.get_num_vectors(index)

1

You can also add tensors to the index in batches, which will typically be more efficient than adding one vector at a time:

index = ExFaiss.Index.add(index, Nx.random_normal({32, 512}))

%ExFaiss.Index{dim: 512, ref: #Reference<0.3614244527.2391146505.203005>, device: :host}

Now your index will have 33 vectors:

ExFaiss.Index.get_num_vectors(index)

33

ExFaiss.Index.add/2 requires your tensor to have a rank of 1 or 2, with the final dimension being equal to the number of dimensions specified in your index. Additionally, all vectors must have a type of {:f, 32}. If your tensor doesn’t meet these requirements, ExFaiss will raise:

ExFaiss.Index.add(index, Nx.random_normal({128}))

ExFaiss.Index.add(index, Nx.iota({512}))


Note that while each call to ExFaiss.Index.add/2 returns a new index struct, each struct holds a reference to the same underlying mutable index. That means successive calls will modify the same index:

ExFaiss.Index.add(index, Nx.random_normal({32, 512}))

        
          
        
      
%ExFaiss.Index{dim: 512, ref: #Reference<0.3614244527.2391146505.203005>, device: :host}

Even though I haven’t reassigned index, the contents of the underlying index have changed:

ExFaiss.Index.get_num_vectors(index)

        
          
        
      
65
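Because the underlying index is mutated in place, concurrent writes from multiple processes can race. One common pattern (a sketch, not part of ExFaiss itself; the MyApp.VectorIndex module name is just an example) is to own the index in a single GenServer so that all writes are serialized:

```elixir
defmodule MyApp.VectorIndex do
  use GenServer

  # Client API

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  def add(tensor), do: GenServer.call(__MODULE__, {:add, tensor})
  def search(query, k), do: GenServer.call(__MODULE__, {:search, query, k})

  # Server callbacks

  @impl true
  def init(opts) do
    dim = Keyword.get(opts, :dim, 512)
    {:ok, ExFaiss.Index.new(dim, "Flat")}
  end

  @impl true
  def handle_call({:add, tensor}, _from, index) do
    # Writes are serialized through this single process.
    {:reply, :ok, ExFaiss.Index.add(index, tensor)}
  end

  def handle_call({:search, query, k}, _from, index) do
    {:reply, ExFaiss.Index.search(index, query, k), index}
  end
end
```

Whether you need this depends on your workload; read-heavy applications may be fine calling search directly.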

Searching an Index

With a populated index, you can query the index with 1 or more query vectors to find the top k vectors closest to your query vectors using ExFaiss.Index.search/3:

query = Nx.random_normal({512})

ExFaiss.Index.search(index, query, 5)
      
%{
  distances: #Nx.Tensor<
    f32[1][5]
    [
      [539.4435424804688, 541.6113891601562, 544.9053344726562, 557.1708374023438, 557.5189819335938]
    ]
  >,
  labels: #Nx.Tensor<
    s64[1][5]
    [
      [38, 3, 2, 39, 37]
    ]
  >
}

ExFaiss.Index.search/3 returns a map with keys :distances and :labels. :distances is a tensor of shape {batch, k}, where batch equals the number of query vectors and k equals the k specified in the search, i.e. the number of candidate vectors the search should return. :distances represents the distance of each returned vector from the corresponding query vector, according to the distance metric you specified when creating the index. :labels holds integer labels for each of the candidate vectors in your index. By default, :labels are assigned in sequential, increasing order.
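Since both :labels and :distances come back as tensors, a common next step is to flatten them into plain Elixir lists, for example to look the labels up in another data store. A small sketch:

```elixir
%{labels: labels, distances: distances} = ExFaiss.Index.search(index, query, 5)

# Pair each label with its distance, nearest first.
labels
|> Nx.to_flat_list()
|> Enum.zip(Nx.to_flat_list(distances))
```

This yields a list of {label, distance} tuples you can feed into ordinary Elixir code.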

If you’re using ExFaiss in conjunction with a structured database like Postgres, it might be more convenient to map these labels directly to Postgres primary keys. Faiss supports indices with user-provided integer IDs. All you need to do is create a new index with the IDMap modifier:

index = ExFaiss.Index.new(128, "IDMap,Flat")

%ExFaiss.Index{dim: 128, ref: #Reference<0.3614244527.2391146500.201520>, device: :host}

Then you can use ExFaiss.Index.add_with_ids/3 to pass both vectors and integer IDs to the index:

index = ExFaiss.Index.add_with_ids(index, Nx.random_normal({5, 128}), Nx.tensor([1, 5, 7, 9, 11]))

%ExFaiss.Index{dim: 128, ref: #Reference<0.3614244527.2391146500.201520>, device: :host}

Now when you search, your labels will map directly to the IDs you specified when adding vectors to your index:

ExFaiss.Index.search(index, Nx.random_normal({128}), 2)

%{
  distances: #Nx.Tensor<
    f32[1][2]
    [
      [250.86492919921875, 260.0044860839844]
    ]
  >,
  labels: #Nx.Tensor<
    s64[1][2]
    [
      [9, 7]
    ]
  >
}

Training an Index

Most of the indices you’ll work with require training. This means you must provide a “representative sample” of input data to train the index before you can query it. You can do this with ExFaiss.Index.train/2:

index =
  ExFaiss.Index.new(10, "HNSW,Flat")
  |> ExFaiss.Index.train(Nx.random_normal({100, 10}))
  |> ExFaiss.Index.add(Nx.random_normal({100, 10}))

ExFaiss.Index.search(index, Nx.random_normal({10}), 3)

%{
  distances: #Nx.Tensor<
    f32[1][3]
    [
      [4.295245170593262, 5.3970627784729, 6.233855247497559]
    ]
  >,
  labels: #Nx.Tensor<
    s64[1][3]
    [
      [63, 11, 62]
    ]
  >
}

Persistence

ExFaiss indices are stored in memory; however, you can persist them to disk and reload them using ExFaiss.Index.to_file/2 and ExFaiss.Index.from_file/2:

:ok = ExFaiss.Index.to_file(index, "index.bin")
index = ExFaiss.Index.from_file("index.bin", 0)

%ExFaiss.Index{dim: 10, ref: #Reference<0.3614244527.2391146529.201916>, device: nil}

You can then search and use the index as normal:

ExFaiss.Index.search(index, Nx.random_normal({10}), 3)

%{
  distances: #Nx.Tensor<
    f32[1][3]
    [
      [2.422379493713379, 4.168035507202148, 4.876709938049316]
    ]
  >,
  labels: #Nx.Tensor<
    s64[1][3]
    [
      [82, 71, 73]
    ]
  >
}

Clustering with ExFaiss

In addition to similarity search, Faiss supports efficient clustering of dense vectors. Clustering is the process of grouping data in an unsupervised manner (e.g. without needing access to labels). If you know nothing else about a dataset, you can use clustering to get access to groups of similar data in the dataset.

For example, if you have a dataset that represents customer behavior on your website, you can use clustering to gain insights to the “types” of users you have (e.g. browsers or purchasers).

Clusterings in ExFaiss are created with ExFaiss.Clustering.new/3:

clustering = ExFaiss.Clustering.new(128, 10)

%ExFaiss.Clustering{
  ref: #Reference<0.3614244527.2391146520.201358>,
  k: 10,
  index: %ExFaiss.Index{dim: 128, ref: #Reference<0.3614244527.2391146500.201547>, device: :host},
  trained?: nil
}

Then, you can train the cluster on your dataset:

clustering = ExFaiss.Clustering.train(clustering, Nx.random_normal({400, 128}))

%ExFaiss.Clustering{
  ref: #Reference<0.3614244527.2391146520.201358>,
  k: 10,
  index: %ExFaiss.Index{dim: 128, ref: #Reference<0.3614244527.2391146500.201547>, device: :host},
  trained?: true
}

Finally, you can query the clustering for specific cluster assignments for data points:

ExFaiss.Clustering.get_cluster_assignment(clustering, Nx.random_normal({128}))

%{
  distances: #Nx.Tensor<
    f32[1][1]
    [
      [107.08744812011719]
    ]
  >,
  labels: #Nx.Tensor<
    s64[1][1]
    [
      [8]
    ]
  >
}

:distances indicates the distance of your query vector from the centroid of the cluster, and :labels is the cluster label. Under the hood, clustering actually maintains an index with k vectors where k is the number of clusters you specified when creating the cluster:

ExFaiss.Index.get_num_vectors(clustering.index)

10

You can also get the centroids of the clustering:

ExFaiss.Clustering.get_centroids(clustering)

#Nx.Tensor<
  f32[10][128]
  [
    [0.023990295827388763, 0.15066096186637878, 0.21808193624019623, 0.142413929104805, 0.18317341804504395, -0.07162952423095703, 0.3815433084964752, -0.16259776055812836, -0.43733739852905273, 0.11409493535757065, -0.14631769061088562, 0.2587531805038452, 0.11657790839672089, -0.12204591184854507, 0.09586010128259659, 0.0476464107632637, 0.13613013923168182, 0.3184964656829834, 0.05925193801522255, -0.060553018003702164, -0.10446067899465561, -0.06084907427430153, 0.4304802119731903, 0.2743024230003357, -0.12481584399938583, -0.10227929800748825, -0.06651915609836578, 0.11950135976076126, -0.12169258296489716, -0.13921332359313965, -0.14966388046741486, 0.28956955671310425, 0.12099193781614304, 0.023360300809144974, -0.2539049983024597, -0.25003859400749207, -0.45176732540130615, 0.17149312794208527, 0.1296563446521759, 0.10130268335342407, 0.214811772108078, -0.29942402243614197, -0.3588094115257263, 0.20960432291030884, 0.18304313719272614, 0.4666171967983246, -0.060203924775123596, 0.32712119817733765, -0.05794418603181839, 0.3709704875946045, ...],
    ...
  ]
>

ExFaiss in Your Applications

There are many considerations you should keep in mind when integrating ExFaiss into your application. These are just a few:

  • Faiss (and as a result ExFaiss) doesn’t support indexing of non-vector data like text, usernames, etc. It’s likely you’ll need to combine ExFaiss with a relational database like Postgres for most legitimate applications. Vector databases like Milvus provide a good reference for the features necessary for building production-ready vector-search applications.

  • ExFaiss also does not automatically persist indices. If your application crashes, and you don’t explicitly back up your indices to disk, you will lose all of the data associated with an index. Fortunately, with Elixir, it’s easy enough to write jobs that continuously back up your indices.

  • ExFaiss indices on both CPU and GPU are not thread-safe for mutating operations such as training and adding vectors to the index. On the GPU, no ExFaiss index operations are thread-safe.

  • It’s faster to search in batches rather than for single query vectors at a time. You’ll want to consider using dynamic batching to ensure overlapping requests get queued for searching at the same time.

  • Depending on your setup, ExFaiss might compete for resources with other computationally expensive models (like neural network inference). You’ll want to avoid launching these computations in parallel or launching multiple searches in parallel. Fortunately, the benefit of using ExFaiss in conjunction with Axon is that you have fine-grained control over how requests in your search application get handled in batches.
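
For example, a simple safeguard against losing an in-memory index is a small process that periodically snapshots it to disk using ExFaiss.Index.to_file/2. This is only a sketch; the module name, interval, and path are hypothetical, and a real implementation would coordinate with the process that owns the index:

```elixir
defmodule MyApp.IndexBackup do
  use GenServer

  # Hypothetical backup interval and snapshot path.
  @interval :timer.minutes(5)
  @path "index.bin"

  def start_link(index), do: GenServer.start_link(__MODULE__, index)

  @impl true
  def init(index) do
    schedule_backup()
    {:ok, index}
  end

  @impl true
  def handle_info(:backup, index) do
    # Persist the current index to disk using native Faiss IO.
    :ok = ExFaiss.Index.to_file(index, @path)
    schedule_backup()
    {:noreply, index}
  end

  defp schedule_backup, do: Process.send_after(self(), :backup, @interval)
end
```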

Conclusion

I hope this post served as a good introduction to vector search and ExFaiss. There are numerous applications of machine-learning-based similarity search in the real world, and now you can build ML-powered search applications entirely in Elixir.

In my next post, we’ll rework the example in my previous post on search to make use of ExFaiss, LiveView, and some other exciting new libraries in the Elixir machine learning ecosystem.

Semantic Search with Phoenix, Axon, Bumblebee, and ExFaiss

Several shelves of different wine bottles

Machine Learning Advisor

Sean Moriarity

Introduction

In my previous post, Semantic Search with Phoenix, Axon, and Elastic, I detailed how you can use Elixir’s machine learning libraries to create a semantic search tool capable of pairing users with wines based on natural language descriptions.

Since that post was published, Elixir’s machine learning ecosystem has grown significantly with the introduction of the Bumblebee library. Bumblebee is a library that gives Elixir developers access to a variety of powerful pre-trained models available on HuggingFace.

Additionally, Nx recently introduced a serving capability designed for online deployment scenarios. Finally, I recently released a library called ExFaiss, which provides bindings to the powerful vector search library FAISS.

With these recent additions to the Elixir ecosystem, I thought it would be a good idea to update my previous post with the newest libraries available. For additional context, I suggest you read my original post on this topic. In this post, we’ll create a semantic search tool for wines using Phoenix, Axon, Bumblebee, and ExFaiss.

Setting Up the Application

Start by creating a new Phoenix application. I am using Phoenix 1.7:

mix phx.new sommelier

Next, you’ll need to add the following dependencies to your application:

[
  ...
  {:bumblebee, "~> 0.1"},
  {:nx, "~> 0.4"},
  {:exla, "~> 0.4"},
  {:ex_faiss, github: "elixir-nx/ex_faiss"}
]

Then, run mix deps.get:

mix deps.get

Finally, you’ll need to create your database:

mix ecto.create

And you’re ready to get started!

Setting up the Wine Resource

In the original semantic search application, you didn’t need to use Ecto to manage wine documents because you used Elasticsearch for persistence. This time, without Elasticsearch, you’ll need an Ecto resource to persist information about wines. Run the following command to generate a new context and schema for wines:

mix phx.gen.context Wines Wine wines name:string url:string embedding:binary

For each wine, we’ll store the name, its URL on wine.com, and an embedding, which is a vector that mathematically captures semantic information about the wine. The embedding will be generated from a semantic similarity model.
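
Because the embedding column is a binary, you’ll convert between tensors and binaries when writing and reading wines. The round trip looks roughly like this (a sketch; :f32 is the element type produced by the model, and the {1, 384} shape matches the embedding size used later in this post):

```elixir
# Serialize an embedding tensor for storage in Postgres...
binary = Nx.to_binary(embedding)

# ...and restore it later. Nx.from_binary/2 returns a flat tensor,
# so reshape it if you need the original {1, 384} shape back.
restored =
  binary
  |> Nx.from_binary(:f32)
  |> Nx.reshape({1, 384})
```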

Make sure you run mix ecto.migrate to create the wine table:

mix ecto.migrate

Creating the Embedding Pipeline

Your semantic search application will take a natural language query, compute an embedded representation of the query using an Axon model, and then compare the embedded representation to existing representations of wines in an index.

Create a new file lib/sommelier/model.ex. This module will be responsible for the embedding pipeline you’ll use to embed natural language queries:

defmodule Sommelier.Model do
end

In model.ex, create a new function called serving that looks like:

# These module attributes are assumptions: @batch_size must match the
# batch_size configured for the serving in application.ex, and
# @sequence_length is an assumed padding length for the tokenizer.
@batch_size 8
@sequence_length 64

def serving() do
  {:ok, %{model: model, params: params}} =
    Bumblebee.load_model({:hf, "sentence-transformers/paraphrase-MiniLM-L6-v2"})

  {:ok, tokenizer} =
    Bumblebee.load_tokenizer({:hf, "sentence-transformers/paraphrase-MiniLM-L6-v2"})

  {_init_fn, predict_fn} = Axon.build(model, compiler: EXLA)

  Nx.Serving.new(fn ->
    fn %{size: size} = inputs ->
      inputs = Nx.Batch.pad(inputs, @batch_size - size)
      predict_fn.(params, inputs)[:pooled_state]
    end
  end)
  |> Nx.Serving.client_preprocessing(fn input ->
    inputs =
      Bumblebee.apply_tokenizer(tokenizer, input,
        length: @sequence_length,
        return_token_type_ids: false
      )

    {Nx.Batch.concatenate([inputs]), :ok}
  end)
end

Next, add the following predict function to the module:

def predict(text) do
  Nx.Serving.batched_run(SommelierModel, text)
end

Next, add the following to your application.ex:

...
{Nx.Serving,
  serving: Sommelier.Model.serving(),
  name: SommelierModel,
  batch_size: 8,
  batch_timeout: 100},
# Start the Endpoint (http/https)
SommelierWeb.Endpoint

This will create and start a new Nx.Serving, which will handle the pre-processing and model inference in batches behind the scenes to better use resources on the server. You can test that your serving works by starting your application:

iex -S mix phx.server

And attempting to embed some text:

iex> Sommelier.Model.predict("a nice red wine")
[info] TfrtCpuClient created.
#Nx.Tensor<
  f32[1][384]
  EXLA.Backend<host:0, 0.1077614924.2375680020.62643>
  [
    [-0.02617456577718258, -8.819118374958634e-4, 0.05722760409116745, 0.12959082424640656, -0.1351461410522461, 0.020610297098755836, 0.005453622899949551, 0.1129845529794693, 0.005040481220930815, 0.041092704981565475, 0.0013414014829322696, 0.045418690890073776, 0.12092263251543045, -0.050827134400606155, -0.01729273609817028, 0.14232997596263885, 0.19483818113803864, 0.032853033393621445, -0.09650719165802002, 0.11645855009555817, 0.01761060580611229, -0.026606624945998192, 0.009240287356078625, -0.05202469229698181, 0.010420262813568115, 0.1607143133878708, -0.03218967467546463, 0.024632470682263374, 0.03334266319870949, 0.03204822167754173, 0.012620541267096996, 0.022357983514666557, -0.05593165010213852, 0.02747185155749321, 0.030256617814302444, -0.08117566257715225, 0.08132530748844147, 0.11905942112207413, 0.014421811327338219, 0.06395658850669861, 0.06002272665500641, 0.06929747760295868, -0.10164055973291397, 0.14846278727054596, -0.019189205020666122, 0.04716624692082405, -0.17113839089870453, -0.01575590670108795, 0.02289806306362152, -0.09108022600412369, ...]
  ]
>

Creating the Vector Index
创建向量索引

In order to perform vector search, you need to create a vector search index using ExFaiss. You can read in depth about ExFaiss in my post here. Create a new module lib/sommelier/index.ex:

defmodule Sommelier.Index do
end

Next, scaffold out a basic GenServer:

use GenServer

def start_link(_opts) do
  GenServer.start_link(__MODULE__, [], name: __MODULE__)
end

@impl true
def init(_opts \\ []) do
  index = ExFaiss.Index.new(384, "IDMap,Flat")
  {:ok, index}
end

When your GenServer starts, it will create a new Flat ExFaiss Index with dimensionality of 384. Next, add the following add client/server API to your GenServer:

def add(id, embedding) do
  GenServer.cast(__MODULE__, {:add, id, embedding})
end

def handle_cast({:add, id, embedding}, index) do
  index = ExFaiss.Index.add_with_ids(index, embedding, id)
  {:noreply, index}
end

Then, add the following search client/server API to your GenServer:

def search(embedding, k) do
  GenServer.call(__MODULE__, {:search, embedding, k})
end

def handle_call({:search, embedding, k}, _from, index) do
  results = ExFaiss.Index.search(index, embedding, k)
  {:reply, results, index}
end

Finally, add the Sommelier.Index to your supervision tree:

[
  ...
  Sommelier.Index,
]

Now, you can test that your index is working properly by restarting your application and adding a few dummy embeddings to the index, and then searching:

iex> embeds = Sommelier.Model.predict("a nice red wine")
iex> Sommelier.Index.add(Nx.tensor([0]), embeds)
iex> Sommelier.Index.search(embeds, 5)
%{
  distances: #Nx.Tensor<
    f32[1][5]
    [
      [0.0, 3.4028234663852886e38, 3.4028234663852886e38, 3.4028234663852886e38, 3.4028234663852886e38]
    ]
  >,
  labels: #Nx.Tensor<
    s64[1][5]
    [
      [0, -1, -1, -1, -1]
    ]
  >
}

Creating the Search Functionality and LiveView

With your basic search and embedding infrastructure in place, you can go about creating the search LiveView. Create a file lib/sommelier_web/search_live/index.ex:

defmodule SommelierWeb.SearchLive.Index do
  use SommelierWeb, :live_view

  @impl true
  def mount(_params, _session, socket) do
    {:ok, assign(socket, :results, [])}
  end
end

Next, implement the following render function to render search results:

@impl true
def render(assigns) do
  ~H"""
  <div>
    <form name="wines-search" id="wines-search" phx-submit="search_for_wines">
      <label for="search" class="block text-sm font-medium text-gray-700">Quick search</label>
      <div class="relative mt-1 flex items-center">
        <input type="text" name="query" id="query" class="block w-full rounded-md border-gray-300 pr-12 shadow-sm focus:border-indigo-500 focus:ring-indigo-500 sm:text-sm">
        <div class="absolute inset-y-0 right-0 flex py-1.5 pr-1.5">
          <kbd class="inline-flex items-center rounded border border-gray-200 px-2 font-sans text-sm font-medium text-gray-400">⌘K</kbd>
        </div>
      </div>
    </form>
    <ul role="list" class="divide-y divide-gray-200">
      <li :for={result <- @results}>
        <p class="text-sm font-medium text-gray-900">
          <a href={result.url}><%= result.name %></a>
        </p>
      </li>
    </ul>
  </div>
  """
end

Next, implement handle_params/3 like this:

@impl true
def handle_params(%{"q" => query}, _uri, socket) do
  results = Sommelier.Wines.search_wine(query)
  {:noreply, assign(socket, :results, results)}
end

def handle_params(_params, _uri, socket) do
  {:noreply, socket}
end

This will look for query parameters in the URL and use the query to search for wines in the database using the unimplemented search_wine/1 function. Finally, implement the following event handler to handle search submissions:

def handle_event("search_for_wines", %{"query" => query}, socket) do
  {:noreply, push_patch(socket, to: ~p"/search?q=#{query}")}
end

Next, you need to implement the actual search functionality in your wine context, like this:

def search_wine(query) do
  embedding = Sommelier.Model.predict(query)
  %{labels: labels} = Sommelier.Index.search(embedding, 5)

  labels
  |> Nx.to_flat_list()
  |> get_wines()
end

def get_wines(ids) do
  from(w in Wine, where: w.id in ^ids) |> Repo.all()
end

Finally, add the following route to your router:

live "/search", SearchLive.Index, :index

Now if you navigate to localhost:4000/search and type in a search, you’ll see the URL change, but no results! That’s because you haven’t actually added any wines to the database!

Seeding the Database

The wine dataset is based on the dataset from my original semantic search post. You can access the wine dataset from here. Download the document and move it to the priv directory of your sommelier project. Next, add the following to priv/repo/seeds.exs:

defmodule EmbedWineDocuments do
  def format_document(document) do
    "Name: #{document["name"]}\n" <>
      "Varietal: #{document["varietal"]}\n" <>
      "Location: #{document["location"]}\n" <>
      "Alcohol Volume: #{document["alcohol_volume"]}\n" <>
      "Alcohol Percent: #{document["alcohol_percent"]}\n" <>
      "Price: #{document["price"]}\n" <>
      "Winemaker Notes: #{document["notes"]}\n" <>
      "Reviews:\n#{format_reviews(document["reviews"])}"
  end

  defp format_reviews(reviews) do
    reviews
    |> Enum.map(fn review ->
      "Reviewer: #{review["author"]}\n" <>
        "Review: #{review["review"]}\n" <>
        "Rating: #{review["rating"]}"
    end)
    |> Enum.join("\n")
  end
end

"priv/wine_documents.jsonl"
|> File.stream!()
|> Stream.map(&Jason.decode!/1)
|> Stream.map(fn document ->
  desc = EmbedWineDocuments.format_document(document)
  embedding = Sommelier.Model.predict(desc)
  {document["name"], document["url"], embedding}
end)
|> Enum.each(fn {name, url, embedding} ->
  Sommelier.Wines.create_wine(%{"name" => name, "url" => url, "embedding" => Nx.to_binary(embedding)})
end)

Now, run mix run priv/repo/seeds.exs to add each wine to your database:

mix run priv/repo/seeds.exs

Note that this may run for a while depending on the machine you’re using.

Next, you need to ensure your database remains in sync with your wine index. You can do this by loading embeddings into the index on application startup. Adjust your init/1 function in Sommelier.Index to look like this:

def init(_opts \\ []) do
  index = ExFaiss.Index.new(384, "IDMap,Flat")
  index =
    Sommelier.Wines.list_wines()
    |> Enum.reduce(index, fn wine, index ->
      embedding = wine.embedding
      id = wine.id
      ExFaiss.Index.add_with_ids(index, Nx.from_binary(embedding, :f32), Nx.tensor([id]))
    end)

  {:ok, index}
end

This will load embeddings from the database when your application starts. Note that there are better ways to do this (e.g. by persisting snapshots of your index with native Faiss IO); however, this works well for simplicity.

Running the Search

With your database and index seeded with wines, restart your application and navigate to localhost:4000/search. Now, try running a few queries for wines. You’ll find that you can find excellent wine pairings just by describing what you’re looking for!

Conclusion

The Elixir ecosystem makes it easy to build machine-learning-enabled applications. This is a relatively simple example, but it’s still powerful! In about 15 minutes you have a working semantic search application, without needing any external tools or services.

Until next time!

Audio Speech Recognition in Elixir with Whisper Bumblebee

A bumblebee landing on a purple flower.

Machine Learning Advisor

Sean Moriarity

Introduction

In December, we introduced Bumblebee to the Elixir community. Bumblebee is a library for working with powerful pre-trained models directly in Elixir. The initial release of Bumblebee contains support for models like GPT2, Stable Diffusion, ConvNeXt, and more.

You can read some of my previous posts on Bumblebee here:

  1. Unlocking the power of transformers with Bumblebee
  2. Stable Diffusion with Bumblebee
  3. Semantic Search with Phoenix, Axon, Bumblebee, and ExFaiss

Since the introduction of Bumblebee, we’ve been working hard to improve the usability of the existing models, and also expand the number of tasks and models available for use. This includes support for additional models such as XLM-Roberta and CamemBert, as well as additional tasks such as zero-shot classification.

Even more recently, we’ve added support for Whisper. Whisper is an audio-speech recognition model created by OpenAI, capable of producing accurate transcriptions of audio in a variety of languages. In this post, I’ll go over the basics of Whisper and describe how you can use it in your Elixir applications.

What is Whisper?

Whisper is a deep learning model trained on over 680,000 hours of multi-lingual, multi-task audio data. To put that into perspective, Whisper was trained on about 77 years of audio to achieve state-of-the-art performance on a variety of transcription tasks.

Whisper is an audio-speech recognition model. The goal of audio-speech recognition is to translate spoken word into text. Audio-speech recognition is applicable to a variety of applications such as closed caption generation for videos and podcasts, or transcription of commands in speech enabled digital assistants such as Alexa and Siri.

Audio-speech recognition is a challenging task simply because audio data is hard to work with. Compared to imagery or text, audio data is, quite literally, noisy. Most environments have some ambient background noise, which can make speech recognition challenging for models. Additionally, speech recognition needs to be robust to accents, no matter how slight, as well as capable of handling discussions on a variety of topics and transcribing their unique vocabulary correctly.

In addition to accents, background noise, and context, speech is also much more difficult to segment because of the lack of clear boundaries between words and sentences. In written language, there is usually a clear separation between tokens in the form of whitespace or distinct characters. In speech the lines are much blurrier: in English, the ends of words and sentences are often marked only by inflection and pauses, which are much harder for models to detect.

Whisper is a transformer model that consists of an audio encoder and a text-generating decoder. If you’re familiar with traditional transformer architectures such as BART, it’s very similar. Essentially, Whisper is designed to encode audio into some useful representation or embedding before decoding the representation into a sequence of tokens representing text. The key insight with Whisper lies in the quality and scale of the training data. Whisper proves robust to accents and is capable of recognizing jargon from a range of specialties precisely because it was trained on a diverse, large-scale dataset.

Using Whisper from Elixir

Thanks to Bumblebee (and Paulo Valente and Jonatan Kłosko), you can use Whisper directly from Elixir. You’ll need to start by installing Bumblebee, Nx, and EXLA all from the main branch. Additionally, if you don’t want to design an audio-processing pipeline using Membrane or another multimedia framework, you’ll need to install ffmpeg. Bumblebee uses ffmpeg under the hood to process audio files into tensors.

Start by installing Bumblebee, Nx, and EXLA:

Mix.install([
  {:bumblebee, github: "elixir-nx/bumblebee"},
  {:exla, "~> 0.4"},
  {:nx, github: "elixir-nx/nx", sparse: "nx", override: true}
])

Nx.default_backend(EXLA.Backend)

Next, create a new audio-speech recognition serving using Bumblebee.Audio.speech_to_text/4. You will need to pass a variant of the Whisper model, a featurizer, and a tokenizer:

{:ok, whisper} = Bumblebee.load_model({:hf, "openai/whisper-tiny"})
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-tiny"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-tiny"})

serving =
  Bumblebee.Audio.speech_to_text(whisper, featurizer, tokenizer,
    max_new_tokens: 100,
    defn_options: [compiler: EXLA]
  )

This code will download the whisper-tiny checkpoint from OpenAI on HuggingFace. The featurizer is an audio featurizer that will process input audio signals into a normalized form recognized by the model. The tokenizer is the text tokenizer, which is used to convert between integer tokens and text.

Now, you can pass audio files directly to Nx.Serving.run/2:

Nx.Serving.run(serving, {:file, "thinking_elixir.mp3"})


And just like that, you have a transcription of your audio file! Note that the file I uploaded was downloaded from the Thinking Elixir Podcast and ended up getting truncated. In order to transcribe longer audio clips, you need to split the audio file into smaller chunks of time and transcribe each chunk separately.
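One simple approach is to compute overlapping windows that each fit Whisper's fixed 30-second input, transcribe each window, and stitch the text back together. The sketch below only computes the window offsets; the module name, function, and numbers are illustrative, not part of Bumblebee's API:

```elixir
# Compute {offset, length} windows (in seconds) that cover a long clip.
# Overlapping the windows helps avoid cutting words in half at boundaries.
defmodule Chunker do
  @chunk_seconds 30
  @overlap_seconds 5

  def chunks(total_seconds) do
    step = @chunk_seconds - @overlap_seconds

    0
    |> Stream.iterate(&(&1 + step))
    |> Enum.take_while(&(&1 < total_seconds))
    |> Enum.map(fn offset ->
      {offset, min(@chunk_seconds, total_seconds - offset)}
    end)
  end
end

Chunker.chunks(70)
# => [{0, 30}, {25, 30}, {50, 20}]
```

Each window could then be extracted with ffmpeg (for example `-ss offset -t length`) and passed to Nx.Serving.run/2 on its own.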

Perhaps the coolest thing about Bumblebee is the range of possibilities it presents. There’s nothing stopping you from combining Whisper’s ASR capabilities with a summarization model to summarize all of your favorite podcasts or YouTube videos. Or, you can run the transcription through a zero-shot classification model to turn the transcription into commands for a smart home assistant.

For example, you can run this transcription through a zero-shot model to determine the topic of the transcription:

{:ok, model} = Bumblebee.load_model({:hf, "facebook/bart-large-mnli"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "facebook/bart-large-mnli"})

labels = ["cooking", "programming", "dancing"]

zero_shot_serving =
  Bumblebee.Text.zero_shot_classification(
    model,
    tokenizer,
    labels,
    defn_options: [compiler: EXLA]
  )

{:file, "thinking_elixir.mp3"}
|> then(&Nx.Serving.run(serving, &1))
|> get_in([:results, Access.all(), :text])
|> then(&Nx.Serving.run(zero_shot_serving, &1))


Or you can use the transcription with a sentiment classification model:

{:ok, model} = Bumblebee.load_model({:hf, "siebert/sentiment-roberta-large-english"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "roberta-large"})

text_classification_serving =
  Bumblebee.Text.text_classification(
    model,
    tokenizer,
    defn_options: [compiler: EXLA]
  )

{:file, "thinking_elixir.mp3"}
|> then(&Nx.Serving.run(serving, &1))
|> get_in([:results, Access.all()])
|> then(&Nx.Serving.run(text_classification_serving, &1))


Or even run the transcription through an NER pipeline to pull out the entities in the discussion:

{:ok, model} = Bumblebee.load_model({:hf, "dslim/bert-base-NER"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "bert-base-cased"})

text_classification_serving =
  Bumblebee.Text.token_classification(
    model,
    tokenizer,
    aggregation: :same,
    defn_options: [compiler: EXLA]
  )

{:file, "thinking_elixir.mp3"}
|> then(&Nx.Serving.run(serving, &1))
|> get_in([:results])
|> Enum.map(& &1[:text])
|> then(&Nx.Serving.run(text_classification_serving, &1))


(Sorry Whisper messed up your names David and Cade!) With Bumblebee the possibilities are endless!

Conclusion

In this post, you learned how to take advantage of Bumblebee’s newest automatic speech-recognition capabilities. Hopefully this inspires you with some ideas of the cool things you can build from scratch without leaving the comfort of Elixir.

Until next time!

Traditional Machine Learning with Scholar

A black graduation cap and red tassel against a yellow background

Machine Learning Advisor

Sean Moriarity

Introduction

Most of my posts have been focused on Axon, Bumblebee, and deep learning in the Elixir ecosystem.

A lot of early work in the Nx ecosystem centered around deep learning. While deep learning has resulted in incredible progress in many fields, there are a lot of cases where deep learning either doesn’t cut it, or just ends up being overkill for the task at hand.

In particular, machine learning for forecasting and time-series analysis, as well as machine learning on tabular data (which comprises a significant amount of business intelligence applications) are two domains where deep learning has not yet superseded traditional approaches. Particularly for tabular or structured data, approaches based on gradient boosting often outperform significantly larger deep learning models.

Aside from raw metrics, there are many other considerations when choosing a model to deploy in production. Even though a large transformer model might outperform a simple regression model, the regression model can run significantly faster on significantly cheaper hardware. Some traditional models also have an upper hand in terms of interpretability, which can make it easier to sell to business leaders when determining how to safely deploy a model to production.

There are also arguments for why using deep learning over more traditional approaches actually leads to a simpler technical solution and, in the long term, reduced costs (technical and otherwise). Despite this, I would argue that you should choose the simplest tool (deep learning or not) to ship and move on from there.

In this post, I will walk you through the basics of traditional machine learning in Elixir with Scholar. This should give you a better idea of the other tools available to you in the ecosystem, as well as some areas we’re still lacking (and could use some help!)

What and Why is Scholar?

Scholar is a set of machine learning tools built on top of Nx. All of the library functions are implemented as numerical definitions, which means they can be JIT-compiled to both CPUs and GPUs. It also means you can use Scholar interchangeably with the other libraries in the Nx ecosystem.

Scholar is meant to be a scikit-learn-like library. This means it offers implementations of non-deep-learning models like linear and logistic regression, k-nearest neighbors, k-means, and more. It also offers a plethora of machine learning utilities such as loss functions, distance functions, and more. That means Scholar might be useful to you, even if you’re working with Axon or Bumblebee.

A lot of Scholar’s functionality, specifically some of the unsupervised bits, can also be useful in conjunction with Explorer and VegaLite for conducting exploratory data analysis.

As of this writing, Scholar is still in pre-0.1 release, which means you might find some rough edges. There’s also still a lot of work left to fill the gap between Scholar and scikit-learn. If you’re interested in helping bring the Elixir ecosystem further along, contributions are welcome!

Linear Regression

Perhaps the simplest machine learning model, and likely the first one anybody is introduced to, is a linear regression model. Linear regression tries to fit a function y = mx + b (where m and x can be arbitrarily high dimensional). Linear regression is great because:

  1. It’s simple
  2. It’s fast
  3. It’s interpretable
  4. A lot of things can be modeled with a line

While a lot of relationships are not necessarily linear, you can go a really long way with a simple linear regression model. A (made up) rule of thumb is that 80% of machine learning problems can be solved reasonably well with a linear regression model.

With Scholar, you can fit a linear regression model in a few lines of code. Start by creating some synthetic data:

m = :rand.uniform() * 10
b = :rand.uniform() * 10

key = Nx.Random.key(42)
size = 100
{x, new_key} = Nx.Random.normal(key, 0.0, 1.0, shape: {size, 1})
{noise_x, new_key} = Nx.Random.normal(new_key, 0.0, 1.0, shape: {size, 1})
{noise_b, _} = Nx.Random.normal(new_key, 0.0, 1.0, shape: {size, 1})

y =
  m
  |> Nx.multiply(Nx.add(x, noise_x))
  |> Nx.add(b)
  |> Nx.add(noise_b)

:ok

:ok

This code block creates some synthetic data with a linear relationship. Rather than produce a perfectly straight line, we add some noise to ensure the relationship is not perfectly linear. This simulates what we’d see in real life a little more. You can visualize this data with VegaLite:

alias VegaLite, as: Vl

Vl.new(title: "Scatterplot Distribution", width: 720, height: 480)
|> Vl.data_from_values(%{
  x: Nx.to_flat_list(x),
  y: Nx.to_flat_list(y)
})
|> Vl.mark(:point)
|> Vl.encode_field(:x, "x", type: :quantitative)
|> Vl.encode_field(:y, "y", type: :quantitative)


Notice how the relationship contained in this scatter plot is relatively linear, but not perfectly so. This is much closer to what real-world data looks like.

The goal of linear regression is to find a line that captures the relationship present in the scatter plot above. You want the line that minimizes the total (squared) distance between itself and all of the points in the training set. In practice, that means the line you draw might end up not intersecting any of the points in your dataset. The idea is that the curve fit by a linear regression model is a good average generalization of the true relationship present.
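For the one-dimensional case, this fit has a closed-form solution: m = cov(x, y) / var(x) and b = mean(y) - m * mean(x). Here's a plain-Elixir sketch over lists (the `OLS` module is just for illustration; Scholar does the equivalent work with tensors):

```elixir
# Closed-form ordinary least squares for one input variable:
# slope m = cov(x, y) / var(x), intercept b = mean(y) - m * mean(x).
defmodule OLS do
  def fit(xs, ys) do
    n = length(xs)
    mean_x = Enum.sum(xs) / n
    mean_y = Enum.sum(ys) / n

    cov =
      Enum.zip(xs, ys)
      |> Enum.map(fn {x, y} -> (x - mean_x) * (y - mean_y) end)
      |> Enum.sum()

    var = Enum.map(xs, fn x -> (x - mean_x) ** 2 end) |> Enum.sum()

    m = cov / var
    {m, mean_y - m * mean_x}
  end
end

# A perfectly linear dataset recovers its slope and intercept exactly:
OLS.fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
# => {2.0, 1.0}
```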

With Scholar, you can implement linear regression in a single line of code:

model = Scholar.Linear.LinearRegression.fit(x, y)

%Scholar.Linear.LinearRegression{
  coefficients: #Nx.Tensor<
    f32[1][1]
    [
      [2.406812906265259]
    ]
  >,
  intercept: #Nx.Tensor<
    f32[1]
    [4.426097869873047]
  >
}

After running, you’ll see the output, which is a %LinearRegression{} struct containing the parameters of your model. In this case, the coefficients correspond to m and the intercept corresponds to b. You can inspect both m and b to see how close the model came to the true values:

IO.inspect(m)
IO.inspect(b)
:ok

2.181240456860072
4.435846174457203
:ok

Overall, not bad! You can really see how well your model fits by visualizing it overlaid with your original data. First, you need to generate some predictions over the entire distribution. Notice that the graphic covers x values from -3.0 to 3.0, so you can generate 100 points in that range and predict their values using your model:

pred_xs = Nx.linspace(-3.0, 3.0, n: 100) |> Nx.new_axis(-1)
pred_ys = Scholar.Linear.LinearRegression.predict(model, pred_xs)
:ok

:ok

Next, you can use VegaLite to overlay the predicted plots and the actual distribution on top of one another:

Vl.new(title: "Scatterplot Distribution and Fit Curve", width: 720, height: 480)
|> Vl.data_from_values(%{
  x: Nx.to_flat_list(x),
  y: Nx.to_flat_list(y),
  pred_x: Nx.to_flat_list(pred_xs),
  pred_y: Nx.to_flat_list(pred_ys)
})
|> Vl.layers([
  Vl.new()
  |> Vl.mark(:point)
  |> Vl.encode_field(:x, "x", type: :quantitative)
  |> Vl.encode_field(:y, "y", type: :quantitative),
  Vl.new()
  |> Vl.mark(:line)
  |> Vl.encode_field(:x, "pred_x", type: :quantitative)
  |> Vl.encode_field(:y, "pred_y", type: :quantitative)
])


Not bad!

Beyond Linear Regression

Scholar supports a number of other machine learning algorithms including naive Bayes, k-nearest neighbors, and logistic regression. In addition to machine learning algorithms, Scholar supports interpolation routines, principal component analysis (PCA), and distance functions. Scholar is pretty general purpose, but most of the APIs follow the same pattern as what you saw with the linear regression model.

For example, if you wanted to create a binary-classifier using a logistic regression model, you’d use essentially the same two lines of code from the previous section to fit the model and make predictions. For a simple example, you can create synthetic data by binarizing your original data:

binarized_y = Nx.greater(y, 5) |> Nx.squeeze()
:ok

:ok

This converts each of your targets to either 0 or 1. A logistic regression model is very similar to a linear regression model; however, it applies a logistic function to the output to squeeze it between 0 and 1. The squeezed output can be interpreted as a class prediction. You can fit this model in the same way you fit your linear regression model:

model = Scholar.Linear.LogisticRegression.fit(x, binarized_y, num_classes: 2)

%Scholar.Linear.LogisticRegression{
  coefficients: #Nx.Tensor<
    f32[1]
    [1.0982909202575684]
  >,
  bias: #Nx.Tensor<
    f32
    -0.3180295526981354
  >,
  mode: :binary
}

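The logistic function mentioned above is just the sigmoid, 1 / (1 + e^(-z)). As a quick plain-Elixir illustration of how it squeezes any real-valued output into the (0, 1) range (the `sigmoid` variable here is just for demonstration, not part of Scholar's API):

```elixir
# The logistic (sigmoid) function squeezes any real number into (0, 1).
# Predictions are then typically thresholded at 0.5.
sigmoid = fn z -> 1.0 / (1.0 + :math.exp(-z)) end

sigmoid.(0.0)
# => 0.5
```

Large positive inputs map close to 1, large negative inputs close to 0, and 0 maps to exactly 0.5.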
Your coefficients are a bit different from your original model’s. You can check your work by computing your accuracy on the original dataset:

pred_y = Scholar.Linear.LogisticRegression.predict(model, x)
Scholar.Metrics.accuracy(binarized_y, pred_y)

#Nx.Tensor<
  f32
  0.7300000190734863
>

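For reference, accuracy is just the fraction of predictions that match their targets. A plain-Elixir sketch of what Scholar.Metrics.accuracy/2 computes (over lists rather than tensors; the `accuracy` variable is illustrative):

```elixir
# Accuracy: the fraction of predictions equal to their targets.
accuracy = fn targets, preds ->
  matches =
    Enum.zip(targets, preds)
    |> Enum.count(fn {t, p} -> t == p end)

  matches / length(targets)
end

accuracy.([1, 0, 1, 1], [1, 0, 0, 1])
# => 0.75
```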
So we get around a 73% accuracy. Not bad! Let’s try another type of model:

model = Scholar.NaiveBayes.Gaussian.fit(x, binarized_y, num_classes: 2)
pred_y = Scholar.NaiveBayes.Gaussian.predict(model, x)
Scholar.Metrics.accuracy(binarized_y, pred_y)

#Nx.Tensor<
  f32
  0.7300000190734863
>

And even another:

model = Scholar.Neighbors.KNearestNeighbors.fit(x, binarized_y, num_classes: 2)
pred_y = Scholar.Neighbors.KNearestNeighbors.predict(model, x)
Scholar.Metrics.accuracy(binarized_y, pred_y)

#Nx.Tensor<
  f32
  0.7599999904632568
>

You should see how easy it is to quickly interchange model types. You don’t really need to have a deep understanding of machine learning to get started. You just need to know how to write some Elixir!

Conclusion

I hope this served as a short introduction to the Scholar library. There’s still a lot more to explore. As I mentioned before, contributions to Scholar are welcome. It’s a great way to get involved in the ecosystem. Implementing a function for Scholar is also a great way to get introduced to Nx itself!

Finally, I have to give a shout out to Mateusz Sluszniak who has done a lot of work on Scholar thus far.

Until next time!

Ready to take your product to the next level with machine learning and Elixir? We’re ready to help. Get in touch today to find out how we can put the latest tech to use for you.

Open-Source Elixir Alternatives to ChatGPT


Machine Learning Advisor

Sean Moriarity

Introduction

In the last few months, large-language models (LLMs) have taken the tech world by storm. The performance of OpenAI’s chat-based models, ChatGPT (GPT 3.5 Turbo) and GPT-4, is a significant step up from the last generation of models. There are several reasons for this significant jump in performance; however, the most significant factor is the careful curation of data for instruction-following tasks, as well as a large amount of training based on human feedback.

Large-language models like ChatGPT are first pre-trained on a large amount of text, then instruction-tuned on task-specific data, and finally, sometimes they’re trained with reinforcement learning on human feedback (RLHF). These innovations produce models that are high-quality and capable of following instructions to complete a variety of tasks.

While it seems at this moment that OpenAI’s proprietary models are a step above the rest, the open-source landscape is quickly closing the gap. And, thanks to Nx and Bumblebee, you can take advantage of many of the open-source ChatGPT competitors right now. In this post, we’ll talk about some of the strongest open-source competitors to ChatGPT, and how you can use them in Elixir.

Why Should I Use an Open-Source Model?

A common question that pops up when designing LLM-powered applications is why you’d use open-source models over proprietary ones. Considering the performance of OpenAI’s offerings compared to open-source alternatives, it can be difficult to justify investing the time and effort into an LLM deployment. Machine learning deployments are difficult, and LLM deployments take this to an extreme. Of course, that doesn’t mean you should just blindly throw GPT-4 at any problem you have.

There are several reasons you may want to consider using open-source:

Data Privacy

One concern when using OpenAI’s (and other providers’) API is data privacy. Depending on your business use case, it may be unacceptable to send data to an external provider. Using an open-source model gives you control of the entire stack. You can work with proprietary and sensitive data without privacy concerns.

Latency

Depending on your specific use case, the latency of OpenAI’s API might be unacceptable. While GPT-3.5 Turbo provides great performance relative to latency, it may still not be fast enough to meet your needs. Fine-tuning a smaller model on task-specific data and avoiding an additional network call may prove a better option.

Task-Specific Performance

GPT-3.5 and GPT-4 have great performance on zero-shot tasks. That means you can go very far with just some careful prompting. With context injection via retrieval, GPT-3.5 and GPT-4 can effectively solve a wide range of tasks. That being said, fine-tuned models remain at the pinnacle of task-specific performance. If you have a specialized use case, and enough data and time to fine-tune a specialized model, you can achieve competitive or better performance than proprietary models.

Cost

A final consideration when deciding between open-source and proprietary models is cost. GPT-4 is a powerful model; however, that power comes at a cost. GPT-3.5 Turbo is much cheaper; however, the performance might be unacceptable for your specific task.

Flan-T5

Flan-T5 is a set of model checkpoints released alongside Google’s paper Scaling Instruction-Finetuned Language Models. Flan-T5 is a variant of the T5 architecture finetuned on a mixture of tasks. Specifically, Flan-T5 is instruction-tuned on a wide variety of tasks, and this finetuning process yields a model with competitive-to-state-of-the-art performance on a number of benchmarks.

Flan-T5 is one of multiple models you can use in Bumblebee. The most competitive checkpoint is flan-t5-xxl; however, a finetuned flan-t5-xl can be competitive as well. flan-t5-xxl will require a large GPU as just the checkpoint parameters are around 45GB. To use flan-t5 for text generation, you can use Bumblebee to load both the tokenizer and model:

Nx.default_backend(EXLA.Backend)

{:ok, model} = Bumblebee.load_model({:hf, "google/flan-t5-xl"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "google/flan-t5-xl"})


Then you can wrap the model in a generation serving:

serving = Bumblebee.Text.generation(model, tokenizer, defn_options: [compiler: EXLA])


And create generations:

Nx.Serving.run(serving, "Elixir is a")


And you will see:

mystical or magical item used to enhance the powers of a person or animal.


Llama and Friends

Llama is a recent, popular open-source alternative to ChatGPT. Llama is a large-language model from Facebook. It is not instruction-tuned or trained on human feedback; however, there are a number of variants that have been finetuned on an instruction-specific dataset called Alpaca. These finetuned variants achieve competitive performance to ChatGPT and have taken off in popularity due to their performance.

One issue with Llama is its restrictive license. Llama and its weights were initially released to academics and other researchers with a license that restricted commercial use. After a leak, Llama and its variants have more or less popped up everywhere; however, its use is still restricted to non-commercial purposes.

You can use Llama today in the same way you’d use any other Bumblebee model:

Nx.default_backend(EXLA.Backend)
{:ok, model} = Bumblebee.load_model({:hf, "decapoda-research/llama-7b-hf"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "decapoda-research/llama-7b-hf"})


Then you can wrap the model in a generation serving:

serving = Bumblebee.Text.generation(model, tokenizer, defn_options: [compiler: EXLA])


And create generations:

Nx.Serving.run(serving, "Elixir is a")


And you will see:

mystical or magical item used to enhance the powers of a person or animal.


OpenAssistant

The OpenAssistant project is an attempt at replicating ChatGPT and other chat models through a coordinated open-source data collection and model training process. Users can navigate to the OpenAssistant website and participate in the process of labeling data for training. The OpenAssistant project has been continuously releasing models with open licenses. Their recent model is a Pythia model finetuned on data collected for the project. This model is based on the GPT-NeoX architecture from EleutherAI.

Again, you can use the OpenAssistant Pythia model today using Bumblebee:

Nx.default_backend(EXLA.Backend)

{:ok, model} = Bumblebee.load_model({:hf, "OpenAssistant/oasst-sft-1-pythia-12b"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "OpenAssistant/oasst-sft-1-pythia-12b"})


Then you can wrap the model in a generation serving:

serving = Bumblebee.Text.generation(model, tokenizer, defn_options: [compiler: EXLA])


One of the interesting things about OpenAssistant models is that they use special tokens to mark assistant and prompter portions of a conversation. They follow the chat-centric paradigm first introduced by ChatGPT:

Nx.Serving.run(serving, "<|prompter|>Elixir is a<|endoftext|><|assistant|>")


Notice you need to include <|prompter|> and <|assistant|> tokens before the generation. After running this you will see:

A programming language that is high-level, functional, and declarative.


Conclusion

The landscape of open-source large-language models is rapidly growing. As the open-source landscape becomes more competitive with OpenAI’s models, it will make more and more sense to migrate away from proprietary models and closed APIs. The beauty of the Elixir Nx ecosystem is that you can migrate to these open-source alternatives seamlessly. You can use LLMs today directly within your Elixir applications. The next generation of apps is LLM-powered, and I believe Elixir is the language of the LLM-powered future.

Ready to find out how DockYard can put the latest Elixir innovations to work for you? Contact us today.
